University Press of Florida
Introduction to Statistical Thought
Full Citation
Permanent Link: http://ufdc.ufl.edu/AA00011708/00001
 Material Information
Title: Introduction to Statistical Thought
Physical Description: Book
Language: en-US
Creator: Lavine, Michael
Publication Date: 8/3/2008
 Subjects
Subjects / Keywords: statistics, probability, modes of inference, regression, special distributions, Bayesian statistics, OGT+ isbn: 9781616100483
Bayesian Statistics, Mathematics, Probability, Regression (Statistics), Statistical Distributions, Statistics
Mathematics / Probability, Mathematics / Statistics
 Notes
Abstract: This free PDF textbook is intended as an upper level undergraduate or introductory graduate textbook in statistical thinking. It is best suited to students with a good knowledge of calculus and the ability to think abstractly. The focus of the text is the ideas that statisticians care about as opposed to technical details of how to put those ideas into practice. Another unusual aspect is the use of statistical software as a pedagogical tool. That is, instead of viewing the computer merely as a convenient and accurate calculating device, the book uses computer calculation and simulation as another way of explaining and helping readers understand the underlying concepts. The book is written with the statistical language R embedded throughout. R software and accompanying manuals are available for free download from http://www.r-project.org
General Note: Expositive
General Note: Community College, Higher Education
General Note: http://www.ogtp-cart.com/product.aspx?ISBN=9781616100483
General Note: Adobe PDF Reader
General Note: Michael L. Lavine
General Note: Textbook
General Note: lavine@math.umass.edu
General Note: http://florida.theorangegrove.org/og/file/d1462a9c-bc1e-4a1e-67de-90c5a8ed39f5/1/StatisticalThought.pdf
 Record Information
Source Institution: University of Florida
Rights Management: Copyright 2005 by Michael Lavine. Available for free download and more information at: http://www.math.umass.edu/~lavine/Book/book.html
Resource Identifier: isbn - 9781616100483
System ID: AA00011708:00001



Full Text

PAGE 1

Introduction to Statistical Thought
Michael Lavine
August 3, 2008

PAGE 2

Copyright © 2005 by Michael Lavine

PAGE 3

CONTENTS

List of Figures ... vi
List of Tables ... xi
Preface ... xii

1 Probability ... 1
  1.1 Basic Probability ... 1
  1.2 Probability Densities ... 6
  1.3 Parametric Families of Distributions ... 14
    1.3.1 The Binomial Distribution ... 14
    1.3.2 The Poisson Distribution ... 17
    1.3.3 The Exponential Distribution ... 20
    1.3.4 The Normal Distribution ... 22
  1.4 Centers, Spreads, Means, and Moments ... 29
  1.5 Joint, Marginal and Conditional Probability ... 40
  1.6 Association, Dependence, Independence ... 51
  1.7 Simulation ... 57
    1.7.1 Calculating Probabilities ... 57
    1.7.2 Evaluating Statistical Procedures ... 61
  1.8 R ... 72
  1.9 Some Results for Large Samples ... 77
  1.10 Exercises ... 81
2 Modes of Inference ... 93
  2.1 Data ... 93
  2.2 Data Description ... 94
    2.2.1 Summary Statistics ... 95

PAGE 4

    2.2.2 Displaying Distributions ... 100
    2.2.3 Exploring Relationships ... 113
  2.3 Likelihood ... 132
    2.3.1 The Likelihood Function ... 132
    2.3.2 Likelihoods from the Central Limit Theorem ... 139
    2.3.3 Likelihoods for several parameters ... 144
  2.4 Estimation ... 154
    2.4.1 The Maximum Likelihood Estimate ... 154
    2.4.2 Accuracy of Estimation ... 155
    2.4.3 The sampling distribution of an estimator ... 158
  2.5 Bayesian Inference ... 164
  2.6 Prediction ... 174
  2.7 Hypothesis Testing ... 178
  2.8 Exercises ... 192
3 Regression ... 202
  3.1 Introduction ... 202
  3.2 Normal Linear Models ... 210
    3.2.1 Introduction ... 210
    3.2.2 Inference for Linear Models ... 221
  3.3 Generalized Linear Models ... 236
    3.3.1 Logistic Regression ... 236
    3.3.2 Poisson Regression ... 245
  3.4 Predictions from Regression ... 250
  3.5 Exercises ... 253
4 More Probability ... 263
  4.1 More Probability Density ... 263
  4.2 Random Vectors ... 264
    4.2.1 Densities of Random Vectors ... 265
    4.2.2 Moments of Random Vectors ... 266
    4.2.3 Functions of Random Vectors ... 266
  4.3 Representing Distributions ... 271
  4.4 Exercises ... 276
5 Special Distributions ... 279
  5.1 Binomial and Negative Binomial ... 279
  5.2 Multinomial ... 290
  5.3 Poisson ... 292

PAGE 5

  5.4 Uniform ... 303
  5.5 Gamma, Exponential, Chi Square ... 305
  5.6 Beta ... 313
  5.7 Normal ... 316
    5.7.1 The Univariate Normal Distribution ... 316
    5.7.2 The Multivariate Normal Distribution ... 320
  5.8 t and F ... 329
    5.8.1 The t distribution ... 329
    5.8.2 The F distribution ... 335
  5.9 Exercises ... 335
6 Bayesian Statistics ... 343
  6.1 Multidimensional Bayesian Analysis ... 343
  6.2 Metropolis, Metropolis-Hastings, and Gibbs ... 351
  6.3 Exercises ... 370
7 More Models ... 373
  7.1 Hierarchical Models ... 373
  7.2 Time Series and Markov Chains ... 374
  7.3 Contingency Tables ... 388
  7.4 Survival analysis ... 388
  7.5 The Poisson process ... 395
  7.6 Change point models ... 395
  7.7 Spatial models ... 396
  7.8 Point Process Models ... 396
  7.9 Evaluating and enhancing models ... 396
  7.10 Exercises ... 396
8 Mathematical Statistics ... 399
  8.1 Properties of Statistics ... 399
    8.1.1 Sufficiency ... 399
    8.1.2 Consistency, Bias, and Mean-squared Error ... 402
    8.1.3 Efficiency ... 404
    8.1.4 Asymptotic Normality ... 404
    8.1.5 Robustness ... 404
  8.2 Transformations of Parameters ... 404
  8.3 Information ... 404
  8.4 More Hypothesis Testing ... 404
    8.4.1 p values ... 405

PAGE 6

    8.4.2 The Likelihood Ratio Test ... 405
    8.4.3 The Chi Square Test ... 405
    8.4.4 Power ... 405
  8.5 Exponential families ... 405
  8.6 Location and Scale Families ... 405
  8.7 Functionals ... 405
  8.8 Invariance ... 405
  8.9 Asymptotics ... 405
  8.10 Exercises ... 410
Bibliography ... 414

PAGE 7

LIST OF FIGURES

1.1 pdf for time on hold at Help Line ... 7
1.2 $p_Y$ for the outcome of a spinner ... 9
1.3 (a): Ocean temperatures; (b): Important discoveries ... 11
1.4 Change of variables ... 13
1.5 Binomial probabilities ... 16
1.6 $P[X = 3 \mid \lambda]$ as a function of $\lambda$ ... 19
1.7 Exponential densities ... 21
1.8 Normal densities ... 24
1.9 Ocean temperatures at 45°N, 30°W, 1000m depth ... 25
1.10 Normal samples and Normal densities ... 27
1.11 hydrographic stations off the coast of Europe and Africa ... 31
1.12 Water temperatures ... 32
1.13 Two pdf's with ±1 and ±2 SD's ... 37
1.14 Water temperatures with standard deviations ... 41
1.15 Permissible values of $N$ and $X$ ... 44
1.16 Features of the joint distribution of $(X, Y)$ ... 48
1.17 Lengths and widths of sepals and petals of 150 iris plants ... 52
1.18 correlations ... 55
1.19 1000 simulations of $\hat\theta$ for n.sim = 50, 200, 1000 ... 60
1.20 1000 simulations of $\hat\theta$ under three procedures ... 64
1.21 Monthly concentrations of CO2 at Mauna Loa ... 66
1.22 1000 simulations of a FACE experiment ... 69
1.23 Histograms of craps simulations ... 82
2.1 quantiles ... 97
2.2 Histograms of tooth growth ... 101
2.3 Histograms of tooth growth ... 102

PAGE 8

2.4 Histograms of tooth growth ... 103
2.5 calorie contents of beef hot dogs ... 107
2.6 Strip chart of tooth growth ... 110
2.7 Quiz scores from Statistics 103 ... 112
2.8 QQ plots of water temperatures (°C) at 1000m depth ... 114
2.9 Mosaic plot of UCBAdmissions ... 118
2.10 Mosaic plot of UCBAdmissions ... 119
2.11 Old Faithful data ... 122
2.12 Waiting time versus duration in the Old Faithful dataset ... 123
2.13 Time series of duration and waiting time at Old Faithful ... 124
2.14 Time series of duration and waiting time at Old Faithful ... 125
2.15 Temperature versus latitude for different values of longitude ... 128
2.16 Temperature versus longitude for different values of latitude ... 129
2.17 Spike train from a neuron during a taste experiment. The dots show the times at which the neuron fired. The solid lines show times at which the rat received a drop of a .3 M solution of NaCl ... 130
2.18 Likelihood function for the proportion of red cars ... 134
2.19 $\ell(\lambda)$ after $\sum y_i = 40$ in 60 quadrats ... 137
2.20 Likelihood for Slater School ... 138
2.21 Marginal and exact likelihoods for Slater School ... 141
2.22 Marginal likelihood for mean CEO salary ... 143
2.23 FACE Experiment: data and likelihood ... 146
2.24 Likelihood function for Quiz Scores ... 149
2.25 Log of the likelihood function for $(\lambda, \theta_f)$ in Example 2.13 ... 152
2.26 Likelihood function for the probability of winning craps ... 157
2.27 Sampling distribution of the sample mean and median ... 160
2.28 Histograms of the sample mean for samples from Bin($n$, .1) ... 162
2.29 Prior, likelihood and posterior in the seedlings example ... 169
2.30 Prior, likelihood and posterior densities for $\lambda$ with $n = 1, 4, 16$ ... 171
2.31 Prior, likelihood and posterior densities for $\lambda$ with $n = 60$ ... 172
2.32 Prior, likelihood and posterior density for Slater School ... 173
2.33 Plug-in predictive distribution for seedlings ... 176
2.34 Predictive distributions for seedlings after $n = 0, 1, 60$ ... 179
2.35 pdf of the Bin(100, .5) distribution ... 184
2.36 pdfs of the Bin(100, .5) (dots) and N(50, 5) (line) distributions ... 185
2.37 Approximate density of summary statistic $t$ ... 186
2.38 Number of times baboon father helps own child ... 190
2.39 Histogram of simulated values of w.tot ... 191

PAGE 9

3.1 Four regression examples ... 203
3.2 1970 draft lottery. Draft number vs. day of year ... 206
3.3 Draft number vs. day of year with smoothers ... 207
3.4 Total number of New seedlings 1993-1997, by quadrat ... 209
3.5 Calorie content of hot dogs ... 211
3.6 Density estimates of calorie contents of hot dogs ... 213
3.7 The PlantGrowth data ... 215
3.8 Ice cream consumption versus mean temperature ... 222
3.9 Likelihood functions for $(\mu, \delta_M, \delta_P)$ in the Hot Dog example ... 228
3.10 pairs plot of the mtcars data ... 230
3.11 mtcars: various plots ... 233
3.12 likelihood functions for $\beta_1$, $\gamma_1$, $\delta_1$ and $\delta_2$ in the mtcars example ... 235
3.13 Pine cones and O-rings ... 238
3.14 Pine cones and O-rings with regression curves ... 239
3.15 Likelihood function for the pine cone data ... 242
3.16 Actual vs. fitted and residuals vs. fitted for the seedling data ... 247
3.17 Diagnostic plots for the seedling data ... 249
3.18 Actual mpg and fitted values from three models ... 251
3.19 Happiness Quotient of bankers and poets ... 256
4.1 The $(X_1, X_2)$ plane and the $(Y_1, Y_2)$ plane ... 270
4.2 pmf's, pdf's, and cdf's ... 272
5.1 The Binomial pmf ... 285
5.2 The Negative Binomial pmf ... 289
5.3 Poisson pmf for $\lambda = 1, 4, 16, 64$ ... 295
5.4 Rutherford and Geiger's Figure 1 ... 300
5.5 Numbers of firings of a neuron in 150 msec after five different tastants. Tastants: 1 = MSG .1M; 2 = MSG .3M; 3 = NaCl .1M; 4 = NaCl .3M; 5 = water. Panels: A: A strip chart. Each circle represents one delivery of a tastant. B: A mosaic plot. C: Each line represents one tastant. D: Likelihood functions. Each line represents one tastant ... 302
5.6 The line shows Poisson probabilities for $\lambda = 0.2$; the circles show the fraction of times the neuron responded with 0, 1, ..., 5 spikes for each of the five tastants ... 304
5.7 Gamma densities ... 307
5.8 Exponential densities ... 310
5.9 Beta densities ... 314
5.10 Water temperatures (°C) at 1000m depth ... 317

PAGE 10

5.11 Bivariate Normal density ... 323
5.12 Bivariate Normal density ... 326
5.13 $t$ densities for four degrees of freedom and the N(0, 1) density ... 334
6.1 Numbers of pine cones in 1998 as a function of dbh ... 347
6.2 Numbers of pine cones in 1999 as a function of dbh ... 348
6.3 Numbers of pine cones in 2000 as a function of dbh ... 349
6.4 10,000 MCMC samples of the Be(5, 2) density. Top panel: histogram of samples from the Metropolis-Hastings algorithm and the Be(5, 2) density. Middle panel: $\theta_i$ plotted against $i$. Bottom panel: $p(\theta_i)$ plotted against $i$ ... 353
6.5 10,000 MCMC samples of the Be(5, 2) density. Left column: $(\theta^* \mid \theta) = U(\theta - 100, \theta + 100)$; Right column: $(\theta^* \mid \theta) = U(\theta - .00001, \theta + .00001)$. Top: histogram of samples from the Metropolis-Hastings algorithm and the Be(5, 2) density. Middle: $\theta_i$ plotted against $i$. Bottom: $p(\theta_i)$ plotted against $i$ ... 356
6.6 Trace plots of MCMC output from the pine cone code on page 358 ... 360
6.7 Trace plots of MCMC output from the pine cone code with a smaller proposal radius ... 361
6.8 Trace plots of MCMC output from the pine cone code with a smaller proposal radius and 100,000 iterations. The plots show every 10'th iteration ... 362
6.9 Trace plots of MCMC output from the pine cone code with proposal function g.one and 100,000 iterations. The plots show every 10'th iteration ... 364
6.10 Pairs plots of MCMC output from the pine cones example ... 365
6.11 Trace plots of MCMC output from the pine cone code with proposal function g.group and 100,000 iterations. The plots show every 10'th iteration ... 368
6.12 Pairs plots of MCMC output from the pine cones example with proposal g.group ... 369
6.13 Posterior density of $\beta_2$ and $\gamma_2$ from Example 6.2 ... 370
7.1 Graphical representation of hierarchical model for fMRI ... 374
7.2 Some time series ... 376
7.3 $Y_{t+1}$ vs. $Y_t$ for the Beaver and Presidents data sets ... 378
7.4 $Y_{t+k}$ vs. $Y_t$ for the Beaver data set and lags 0 ... 379
7.5 coplot of $Y_{t+1} \sim Y_{t-1} \mid Y_t$ for the Beaver data set ... 381
7.6 Fit of CO2 data ... 384

PAGE 11

7.7 DAX closing prices ... 385
7.8 DAX returns ... 387
7.9 Survival curve for bladder cancer. Solid line for thiotepa; dashed line for placebo ... 391
7.10 Cumulative hazard and log(hazard) curves for bladder cancer. Solid line for thiotepa; dashed line for placebo ... 394
8.1 The Be(.39, .01) density ... 409
8.2 Densities of $\bar{Y}_{in}$ ... 411
8.3 Densities of $Z_{in}$ ... 412

PAGE 12

LIST OF TABLES

1.1 Party Affiliation and Referendum Support ... 42
1.2 Steroid Use and Test Results ... 44
2.1 New and Old seedlings in quadrat 6 in 1992 and 1993 ... 150
3.1 Correspondence between Models 3.3 and 3.4 ... 215
3.2 $\beta$'s for Figure 3.14 ... 240
5.1 Rutherford and Geiger's data ... 299
6.1 The numbers of pine cones on trees in the FACE experiment, 1998-2000 ... 345

PAGE 13

PREFACE

This book is intended as an upper level undergraduate or introductory graduate textbook in statistical thinking with a likelihood emphasis for students with a good knowledge of calculus and the ability to think abstractly. By "statistical thinking" is meant a focus on ideas that statisticians care about as opposed to technical details of how to put those ideas into practice. By "likelihood emphasis" is meant that the likelihood function and likelihood principle are unifying ideas throughout the text. Another unusual aspect is the use of statistical software as a pedagogical tool. That is, instead of viewing the computer merely as a convenient and accurate calculating device, we use computer calculation and simulation as another way of explaining and helping readers understand the underlying concepts.

Our software of choice is R (R Development Core Team [2006]). R and accompanying manuals are available for free download from http://www.r-project.org. You may wish to download An Introduction to R to keep as a reference. It is highly recommended that you try all the examples in R. They will help you understand concepts, give you a little programming experience, and give you facility with a very flexible statistical software package. And don't just try the examples as written. Vary them a little; play around with them; experiment. You won't hurt anything and you'll learn a lot.

PAGE 14

CHAPTER 1
PROBABILITY

1.1 Basic Probability

Let $\mathcal{X}$ be a set and $\mathcal{F}$ a collection of subsets of $\mathcal{X}$. A probability measure, or just a probability, on $(\mathcal{X}, \mathcal{F})$ is a function $\mu : \mathcal{F} \to [0, 1]$. In other words, to every set in $\mathcal{F}$, $\mu$ assigns a probability between 0 and 1. We call $\mu$ a set function because its domain is a collection of sets. But not just any set function will do. To be a probability $\mu$ must satisfy

1. $\mu(\emptyset) = 0$ ($\emptyset$ is the empty set),
2. $\mu(\mathcal{X}) = 1$, and
3. if $A_1$ and $A_2$ are disjoint then $\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2)$.

One can show that property 3 holds for any finite collection of disjoint sets, not just two; see Exercise 1. It is common practice, which we adopt in this text, to assume more: that property 3 also holds for any countable collection of disjoint sets.

When $\mathcal{X}$ is a finite or countably infinite set (usually integers) then $\mu$ is said to be a discrete probability. When $\mathcal{X}$ is an interval, either finite or infinite, then $\mu$ is said to be a continuous probability. In the discrete case, $\mathcal{F}$ usually contains all possible subsets of $\mathcal{X}$. But in the continuous case, technical complications prohibit $\mathcal{F}$ from containing all possible subsets of $\mathcal{X}$. See Casella and Berger [2002] or Schervish [1995] for details. In this text we deemphasize the role of $\mathcal{F}$ and speak of probability measures on $\mathcal{X}$ without mentioning $\mathcal{F}$.

In practical examples $\mathcal{X}$ is the set of outcomes of an "experiment" and $\mu$ is determined by experience, logic or judgement. For example, consider rolling a six-sided die. The set of outcomes is $\{1, 2, 3, 4, 5, 6\}$ so we would assign $\mathcal{X} \equiv \{1, 2, 3, 4, 5, 6\}$.

PAGE 15

If we believe the die to be fair then we would also assign $\mu(\{1\}) = \mu(\{2\}) = \cdots = \mu(\{6\}) = 1/6$. The laws of probability then imply various other values such as

$$\mu(\{1, 2\}) = 1/3$$
$$\mu(\{2, 4, 6\}) = 1/2$$

etc. Often we omit the braces and write $\mu(1)$, $\mu(2)$, etc. Setting $\mu(i) = 1/6$ is not automatic simply because a die has six faces. We set $\mu(i) = 1/6$ because we believe the die to be fair.

We usually use the word "probability" or the symbol P in place of $\mu$. For example, we would use the following phrases interchangeably:

- The probability that the die lands 1
- P(1)
- P[the die lands 1]
- $\mu(\{1\})$

We also use the word distribution in place of probability measure.

The next example illustrates how probabilities of complicated events can be calculated from probabilities of simple events.

Example 1.1 (The Game of Craps)
Craps is a gambling game played with two dice. Here are the rules, as explained on the website www.online-craps-gambling.com/craps-rules.html.

"For the dice thrower (shooter) the object of the game is to throw a 7 or an 11 on the first roll (a win) and avoid throwing a 2, 3 or 12 (a loss). If none of these numbers (2, 3, 7, 11 or 12) is thrown on the first throw (the Come-out roll) then a Point is established (the point is the number rolled) against which the shooter plays. The shooter continues to throw until one of two numbers is thrown, the Point number or a Seven. If the shooter rolls the Point before rolling a Seven he/she wins, however if the shooter throws a Seven before rolling the Point he/she loses."

Ultimately we would like to calculate P(shooter wins). But for now, let's just calculate

$$P(\text{shooter wins on Come-out roll}) = P(7 \text{ or } 11) = P(7) + P(11).$$
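The sum P(7) + P(11) can be checked by brute-force enumeration. The following is only a sketch, assuming both dice are fair so that all 36 ordered pairs of faces are equally likely; the names rolls, total, p7, p11 and p.comeout are chosen here for illustration.

```r
# Enumerate all 36 ordered pairs (d1, d2) of faces on two dice.
# Assumption: the dice are fair, so every pair is equally likely.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
total <- rolls$d1 + rolls$d2

# With equally likely pairs, a probability is a proportion of pairs.
p7  <- mean(total == 7)    # 6 of the 36 pairs sum to 7
p11 <- mean(total == 11)   # 2 of the 36 pairs sum to 11

# "Total is 7" and "total is 11" are disjoint events, so property 3
# of a probability measure lets us add their probabilities.
p.comeout <- p7 + p11
print(c(p7 = p7, p11 = p11, comeout = p.comeout))
```

The enumeration relies on expand.grid to list every combination of the two dice, which keeps the check independent of any hand counting.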

PAGE 16

Using the language of page 1, what is $\mathcal{X}$ in this case? Let $d_1$ denote the number showing on the first die and $d_2$ denote the number showing on the second die. $d_1$ and $d_2$ are integers from 1 to 6. So $\mathcal{X}$ is the set of ordered pairs $(d_1, d_2)$ or

    (6,6) (6,5) (6,4) (6,3) (6,2) (6,1)
    (5,6) (5,5) (5,4) (5,3) (5,2) (5,1)
    (4,6) (4,5) (4,4) (4,3) (4,2) (4,1)
    (3,6) (3,5) (3,4) (3,3) (3,2) (3,1)
    (2,6) (2,5) (2,4) (2,3) (2,2) (2,1)
    (1,6) (1,5) (1,4) (1,3) (1,2) (1,1)

If the dice are fair, then the pairs are all equally likely. Since there are 36 of them, we assign $P(d_1, d_2) = 1/36$ for any combination $(d_1, d_2)$. Finally, we can calculate

$$P(7 \text{ or } 11) = P(6,5) + P(5,6) + P(6,1) + P(5,2) + P(4,3) + P(3,4) + P(2,5) + P(1,6) = 8/36 = 2/9.$$

The previous calculation uses desideratum 3 for probability measures. The different pairs (6,5), (5,6), ..., (1,6) are disjoint, so the probability of their union is the sum of their probabilities.

Example 1.1 illustrates a common situation. We know the probabilities of some simple events like the rolls of individual dice, and want to calculate the probabilities of more complicated events like the success of a Come-out roll. Sometimes those probabilities can be calculated mathematically as in the example. Other times it is more convenient to calculate them by computer simulation. We frequently use R to calculate probabilities. To illustrate, Example 1.2 uses R to calculate by simulation the same probability we found directly in Example 1.1.

Example 1.2 (Craps, continued)
To simulate the game of craps, we will have to simulate rolling dice. That's like randomly sampling an integer from 1 to 6. The sample() command in R can do that. For example, the following snippet of code generates one roll from a fair, six-sided die and shows R's response:

    > sample(1:6,1)
    [1] 1
    >

When you start R on your computer, you see >, R's prompt. Then you can type a command such as sample(1:6,1) which means "take a sample of size 1 from the numbers 1 through

PAGE 17

6". It could have been abbreviated sample(6,1). R responds with [1] 1. The [1] is the index of the first value printed on that line; you can ignore it. The 1 is R's answer to the sample command; it selected the number "1". Then it gave another >, showing that it's ready for another command. Try this several times; you shouldn't get "1" every time.

Here's a longer snippet that does something more useful.

    > x <- sample(6,10,replace=T)   # take a sample of
                                    # size 10 and call it x
    > x                             # print the ten values
    [1] 6 4 2 3 4 4 3 6 6 2
    > sum(x == 3)                   # how many are equal to 3?
    [1] 2
    >

Note

- # is the comment character. On each line, R ignores all text after #.
- We have to tell R to take its sample with replacement. Otherwise, when R selects "6" the first time, "6" is no longer available to be sampled a second time. In replace=T, the T stands for True.
PAGE 18

On average, we expect 1/6 of the draws to equal 1, another 1/6 to equal 2, and so on. The following snippet is a quick demonstration. We simulate 6000 rolls of a die and expect about 1000 1's, 1000 2's, etc. We count how many we actually get. This snippet also introduces the for loop, which you should try to understand now because it will be extremely useful in the future.

    > x <- sample(6,6000,replace=T)
    > for ( i in 1:6 ) print ( sum ( x==i ) )
    [1] 995
    [1] 1047
    [1] 986
    [1] 1033
    [1] 975
    [1] 964
    >

Each number from 1 through 6 was chosen about 1000 times, plus or minus a little bit due to chance variation.

Now let's get back to craps. We want to simulate a large number of games, say 1000. For each game, we record either 1 or 0, according to whether the shooter wins on the Come-out roll, or not. We should print out the number of wins at the end. So we start with a code snippet like this:

    # make a vector of length 1000, filled with 0's
    wins <- rep ( 0, 1000 )
    for ( i in 1:1000 ) {
      simulate a Come-out roll
      if shooter wins on Come-out, wins[i] <- 1
    }
    sum ( wins ) # print the number of wins

Now we have to figure out how to simulate the Come-out roll and decide whether the shooter wins. Clearly, we begin by simulating the roll of two dice. So our snippet expands to

PAGE 19

    # make a vector of length 1000, filled with 0's
    wins <- rep ( 0, 1000 )
    for ( i in 1:1000 ) {
      d <- sample ( 1:6, 2, replace=T )
      if ( sum(d) == 7 || sum(d) == 11 ) wins[i] <- 1
    }
    sum ( wins ) # print the number of wins

The "||" stands for "or". So that line of code sets wins[i] <- 1 if the sum of the rolls is either 7 or 11. When I ran this simulation R printed out 219. The calculation in Example 1.1 says we should expect around $(2/9) \times 1000 \approx 222$ wins. Our calculation and simulation agree about as well as can be expected from a simulation. Try it yourself a few times. You shouldn't always get 219. But you should get around 222 plus or minus a little bit due to the randomness of the simulation.

Try out these R commands in the version of R installed on your computer. Make sure you understand them. If you don't, print out the results. Try variations. Try any tricks you can think of to help you learn R.

1.2 Probability Densities

So far we have dealt with discrete probabilities, or the probabilities of at most a countably infinite number of outcomes. For discrete probabilities, $\mathcal{X}$ is usually a set of integers, either finite or infinite. Section 1.2 deals with the case where $\mathcal{X}$ is an interval, either of finite or infinite length. Some examples are

Medical trials: the time until a patient experiences a relapse
Sports: the length of a javelin throw
Ecology: the lifetime of a tree
Manufacturing: the diameter of a ball bearing
Computing: the amount of time a Help Line customer spends on hold
Physics: the time until a uranium atom decays
Oceanography: the temperature of ocean water at a specified latitude, longitude and depth

PAGE 20

Probabilities for such outcomes are called continuous. For example, let $Y$ be the time a Help Line caller spends on hold. The random variable $Y$ is often modelled with a density similar to that in Figure 1.1.

Figure 1.1: pdf for time on hold at Help Line

The curve in the figure is a probability density function or pdf. The pdf is large near $y = 0$ and monotonically decreasing, expressing the idea that smaller values of $y$ are more likely than larger values. (Reasonable people may disagree about whether this pdf accurately represents callers' experience.) We typically use the symbols $p$ or $f$ for pdf's. We would write $p(50)$ or $f(50)$ to denote the height of the curve at $y = 50$. For a pdf, probability is the same as area under the curve. For example, the probability that a caller waits less than 60 minutes is

$$P[Y < 60] = \int_0^{60} p(t)\, dt.$$

Every pdf must satisfy two properties.

1. $p(y) \ge 0$ for all $y$.
2. $\int_{-\infty}^{\infty} p(y)\, dy = 1$.
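Both properties, and the idea of probability as area, can be checked numerically. The sketch below uses a hypothetical stand-in for the curve in Figure 1.1: an exponential density with a mean of 20 minutes. The choice of family and the value 20 are assumptions made only for illustration; they are not taken from the figure.

```r
# Hypothetical stand-in for the density in Figure 1.1: an exponential
# pdf with mean 20 minutes (both the family and the mean are assumed).
p <- function(y) dexp(y, rate = 1/20)

# Property 1: the density is never negative (checked on a grid).
y.grid <- seq(0, 300, by = 0.5)
nonneg <- all(p(y.grid) >= 0)

# Property 2: the total area under the curve is 1. This density is 0
# for y < 0, so integrating from 0 to Inf covers the whole real line.
total.area <- integrate(p, lower = 0, upper = Inf)$value

# Probability as area: P[Y < 60] is the area under p from 0 to 60.
p.less.60 <- integrate(p, lower = 0, upper = 60)$value

print(nonneg)
print(c(total = total.area, less.than.60 = p.less.60))
```

integrate() performs numerical quadrature, so its results agree with the exact areas only up to a small numerical error.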

PAGE 21

The first property holds because, if $p(y) < 0$ on the interval $(a, b)$ then $P[Y \in (a, b)] = \int_a^b p(y)\, dy < 0$; and we can't have probabilities less than 0. The second property holds because $P[Y \in (-\infty, \infty)] = \int_{-\infty}^{\infty} p(y)\, dy = 1$.

One peculiar fact about any continuous random variable $Y$ is that $P[Y = a] = 0$, for every $a \in \mathbb{R}$. That's because

$$P[Y = a] = \lim_{\epsilon \to 0} P[Y \in [a, a + \epsilon]] = \lim_{\epsilon \to 0} \int_a^{a + \epsilon} p_Y(y)\, dy = 0.$$

Consequently, for any numbers $a$
PAGE 22

Figure 1.2: $p_Y$ for the outcome of a spinner

- c(0,1) collects 0 and 1 and puts them into the vector (0,1). Likewise, c(1,1) creates the vector (1,1).
- plot(x,y,...) produces a plot. The plot(c(0,1), c(1,1), ...) command above plots the points (x[1],y[1]) = (0,1) and (x[2],y[2]) = (1,1).
- type="l" says to plot a line instead of individual points.
- xlab and ylab say how the axes are labelled.
- ylim=c(0,1.1) sets the limits of the y-axis on the plot. If ylim is not specified then R sets the limits automatically. Limits on the x-axis can be specified with xlim.

At other times we use probability densities and distributions as models for data, and estimate the densities and distributions directly from the data. Figure 1.3 shows how that works. The upper panel of the figure is a histogram of 112 measurements of ocean temperature at a depth of 1000 meters in the North Atlantic near 45° North latitude and 20° West longitude. Example 1.5 will say more about the data. Superimposed on the histogram is a pdf $f$. We think of $f$ as

PAGE 23

underlying the data. The idea is that measuring a temperature at that location is like randomly drawing a value from $f$. The 112 measurements, which are spread out over about a century of time, are like 112 independent draws from $f$. Having the 112 measurements allows us to make a good estimate of $f$. If oceanographers return to that location to make additional measurements, it would be like making additional draws from $f$. Because we can estimate $f$ reasonably well, we can predict with some degree of assurance what the future draws will be like.

The bottom panel of Figure 1.3 is a histogram of the discoveries data set that comes with R and which is, as R explains, "The numbers of 'great' inventions and scientific discoveries in each year from 1860 to 1959." It is overlaid with a line showing the Poi(3.1) distribution. (Named distributions will be introduced in Section 1.3.) It seems that the number of great discoveries each year follows the Poi(3.1) distribution, at least approximately. If we think the future will be like the past then we should expect future years to follow a similar pattern. Again, we think of a distribution underlying the data. The number of discoveries in a single year is like a draw from the underlying distribution. The figure shows 100 years, which allows us to estimate the underlying distribution reasonably well.

Figure 1.3 was produced by the following snippet.

    par ( mfrow=c(2,1) )
    good <- abs ( med.1000$lon + 20 ) < 1 &
            abs ( med.1000$lat - 45 ) < 1
    hist ( med.1000$temp[good], xlab="temperature", ylab="",
           main="", prob=T, xlim=c(5,11) )
    m <- mean ( med.1000$temp[good] )
    s <- sqrt ( var ( med.1000$temp[good] ) )
    x <- seq ( 5, 11, length=40 )
    lines ( density ( med.1000$temp[good] ) )
    hist ( discoveries, xlab="discoveries", ylab="", main="",
           prob=T, breaks=seq(-.5,12.5,by=1) )
    lines ( 0:12, dpois(0:12, 3.1), type="b" )

Note:

- par sets R's graphical parameters. mfrow=c(2,1) tells R to make an array of multiple figures in a 2 by 1 layout.

PAGE 24

Figure 1.3: (a): Ocean temperatures at 1000m depth near 45°N latitude, -20° longitude; (b): Numbers of important discoveries each year 1860-1959

PAGE 25

- med.1000 is a data set of North Atlantic ocean temperatures at a depth of 1000 meters. med.1000$lon and med.1000$lat are the longitude and latitude of the measurements. med.1000$temp are the actual temperatures.
- abs stands for absolute value.
- good <- ... calls those points good whose longitude is between -19 and -21 and whose latitude is between 44 and 46.
- hist() makes a histogram. prob=T turns the y-axis into a probability scale (area under the histogram is 1) instead of counts.
- mean() calculates the mean. var() calculates the variance. Section 1.4 defines the mean and variance of distributions. Section 2.2.1 defines the mean and variance of data sets.
- lines() adds lines to an existing plot.
- density() estimates a density from a data set.

It is often necessary to transform one variable into another as, for example, $Z = g(X)$ for some specified function $g$. We might know $p_X$ (the subscript indicates which random variable we're talking about) and want to calculate $p_Z$. Here we consider only monotonic functions $g$, so there is an inverse $X = h(Z)$.

Theorem 1.1. Let $X$ be a random variable with pdf $p_X$. Let $g$ be a differentiable, monotonic, invertible function and define $Z = g(X)$. Then the pdf of $Z$ is

$$p_Z(t) = p_X\big(g^{-1}(t)\big) \left| \frac{d\, g^{-1}(t)}{dt} \right|$$

Proof. If $g$ is an increasing function then

$$p_Z(b) = \frac{d}{db} P[Z \in (a, b]] = \frac{d}{db} P\big[X \in (g^{-1}(a), g^{-1}(b)]\big] = \frac{d}{db} \int_{g^{-1}(a)}^{g^{-1}(b)} p_X(x)\, dx = \left. \frac{d\, g^{-1}(t)}{dt} \right|_{b} \times p_X\big(g^{-1}(b)\big)$$

The proof when $g$ is decreasing is left as an exercise.

PAGE 26

To illustrate, suppose that $X$ is a random variable with pdf $p_X(x) = 2x$ on the unit interval. Let $Z = 1/X$. What is $p_Z(z)$? The inverse transformation is $X = 1/Z$. Its derivative is $dx/dz = -z^{-2}$. Therefore,

$$p_Z(z) = p_X\big(g^{-1}(z)\big) \left| \frac{d\, g^{-1}(z)}{dz} \right| = \frac{2}{z} \left| -\frac{1}{z^2} \right| = \frac{2}{z^3}$$

And the possible values of $Z$ are from 1 to $\infty$. So $p_Z(z) = 2/z^3$ on the interval $(1, \infty)$. As a partial check, we can verify that the integral is 1.

$$\int_1^{\infty} \frac{2}{z^3}\, dz = \left. -\frac{1}{z^2} \right|_1^{\infty} = 1.$$

Theorem 1.1 can be explained by Figure 1.4. The figure shows an $x$, a $z$, and the function $z = g(x)$. A little interval is shown around $x$; call it $I_x$. It gets mapped by $g$ into a little interval around $z$; call it $I_z$. The density is

$$p_Z(z) \approx \frac{P[Z \in I_z]}{\text{length}(I_z)} = \frac{P[X \in I_x]}{\text{length}(I_x)} \cdot \frac{\text{length}(I_x)}{\text{length}(I_z)} \approx p_X(x)\, |h'(z)| \tag{1.3}$$

The approximations in Equation 1.3 are exact as the lengths of $I_x$ and $I_z$ decrease to 0.

If $g$ is not one-to-one, then it is often possible to find subsets of $\mathbb{R}$ on which $g$ is one-to-one, and work separately on each subset.

Figure 1.4: Change of variables
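The worked example with $p_X(x) = 2x$ and $Z = 1/X$ can also be checked by simulation. The sketch below assumes one particular way of drawing from $p_X$: if $U$ is uniform on (0, 1) then $\sqrt{U}$ has cdf $x^2$ and therefore pdf $2x$. The seed, the sample size and the cutoff $z = 2$ are arbitrary choices made for illustration.

```r
set.seed(1)      # arbitrary seed, used only for reproducibility
n <- 100000      # arbitrary number of simulated draws

# If U is uniform on (0,1) then X = sqrt(U) has cdf x^2, hence pdf 2x.
x <- sqrt(runif(n))
z <- 1 / x

# Compare the simulated P[Z <= 2] with the area under 2/z^3 on (1, 2).
empirical <- mean(z <= 2)
exact <- integrate(function(z) 2 / z^3, lower = 1, upper = 2)$value
print(c(empirical = empirical, exact = exact))
```

The two numbers should agree up to simulation error, which shrinks as n grows. Trying other cutoffs in place of 2 checks other parts of the density.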


1.3 Parametric Families of Distributions

Probabilities often depend on one or more unknown numerical constants. Suppose, for example, that we have a biased coin. Let $\theta$ be the chance that it lands H. Then $P(H) = \theta$; but we might not know $\theta$; it is an unknown numerical constant. In this case we have a family of probability measures, one for each value of $\theta$, and we don't know which one is right. When we need to be explicit that probabilities depend on $\theta$, we use the notation, for example, $P(H \mid \theta)$ or $P(H \mid \theta = 1/3)$. The vertical bar is read "given" or "given that". So $P(H \mid \theta = 1/3)$ is read "the probability of Heads given that $\theta$ equals 1/3" and $P(H \mid \theta)$ is read "the probability of Heads given $\theta$". This notation means

$$P(H \mid \theta = 1/3) = 1/3, \qquad P(T \mid \theta = 1/3) = 2/3, \qquad P(T \mid \theta = 1/5) = 4/5$$

and so on. Instead of "given" we also use the word "conditional". So we would say "the probability of Heads conditional on $\theta$", etc.

The unknown constant $\theta$ is called a parameter. The set of possible values for $\theta$ is denoted $\Theta$ (upper case $\theta$). For each $\theta$ there is a probability measure $\mu_\theta$. The set of all possible probability measures (for the problem at hand),

$$\{\mu_\theta : \theta \in \Theta\},$$

is called a parametric family of probability measures. The rest of this chapter introduces four of the most useful parametric families of probability measures.

1.3.1 The Binomial Distribution

Statisticians often have to consider observations of the following type.
• A repeatable event results in either a success or a failure.
• Many repetitions are observed.
• Successes and failures are counted.
• The number of successes helps us learn about the probability of success.
Such observations are called binomial. Some examples are


Medical trials: A new treatment is given to many patients. Each is either cured or not.
Toxicity tests: Many laboratory animals are exposed to a potential carcinogen. Each either develops cancer or not.
Ecology: Many seeds are planted. Each either germinates or not.
Quality control: Many supposedly identical items are subjected to a test. Each either passes or not.

Because binomial experiments are so prevalent there is specialized language to describe them. Each repetition is called a trial; the number of trials is usually denoted $N$; the unknown probability of success is usually denoted either $p$ or $\theta$; the number of successes is usually denoted $X$. We write $X \sim \text{Bin}(N, p)$. The symbol "$\sim$" is read "is distributed as"; we would say "X is distributed as Binomial N p" or "X has the Binomial N, p distribution". Some important assumptions about binomial experiments are that $N$ is fixed in advance, $\theta$ is the same for every trial, and the outcome of any trial does not influence the outcome of any other trial. When $N = 1$ we say $X$ has a Bernoulli($\theta$) distribution and write $X \sim \text{Bern}(\theta)$; the individual trials in a binomial experiment are called Bernoulli trials.

When a binomial experiment is performed, $X$ will turn out to be one of the integers from 0 to $N$. We need to know the associated probabilities; i.e. $P[X = k \mid \theta]$ for each value of $k$ from 0 to $N$. These probabilities are given by Equation 1.4 whose derivation is given in Section 5.1.

$$P[X = k \mid \theta] = \binom{N}{k} \theta^k (1 - \theta)^{N-k} \tag{1.4}$$

The term $\binom{N}{k}$ is called a binomial coefficient and is read "N choose k". $\binom{N}{k} = \frac{N!}{k!\,(N-k)!}$ and is equal to the number of subsets of size $k$ that can be formed from a group of $N$ distinct items. In case $k = 0$ or $k = N$, $0!$ is defined to be 1. Figure 1.5 shows binomial probabilities for $N \in \{3, 30, 300\}$ and $p \in \{.1, .5, .9\}$.

Example 1.3 (Craps, continued)
This example continues the game of craps. See Examples 1.1 and 1.2.
What is the probability that at least one of the next four players wins on his Come-out roll?
This is a Binomial experiment because
1. We are looking at repeated trials. Each Come-out roll is a trial. It results in either success, or not.
2. The outcome of one trial does not affect the other trials.
3. We are counting the number of successes.

Let $X$ be the number of successes. There are four trials, so $N = 4$. We calculated the probability of success in Example 1.1; it's $p = 2/9$. So $X \sim \text{Bin}(4, 2/9)$. The probability of success in at least one Come-out roll is

$$P[\text{success in at least one Come-out roll}] = P[X \ge 1] = \sum_{i=1}^{4} P[X = i] = \sum_{i=1}^{4} \binom{4}{i} (2/9)^i (7/9)^{4-i} \approx 0.634 \tag{1.5}$$

A convenient way to re-express Equation 1.5 is

$$P[X \ge 1] = 1 - P[X = 0],$$

which can be quickly calculated in R. The dbinom() command computes Binomial probabilities. To compute Equation 1.5 we would write

    1 - dbinom(0,4,2/9)

The 0 says what value of $X$ we want. The 4 and the 2/9 are the number of trials and the probability of success. Try it. Learn it.

Figure 1.5: Binomial probabilities

1.3.2 The Poisson Distribution

Another common type of observation occurs in the following situation.
• There is a domain of study, usually a block of space or time.
• Events arise seemingly at random in the domain.
• There is an underlying rate at which events arise.
Such observations are called Poisson after the 19th century French mathematician Siméon-Denis Poisson. The number of events in the domain of study helps us learn about the rate. Some examples are
Ecology: Tree seedlings emerge from the forest floor.
Computer programming: Bugs occur in computer code.


Quality control: Defects occur along a strand of yarn.
Genetics: Mutations occur in a genome.
Traffic flow: Cars arrive at an intersection.
Customer service: Customers arrive at a service counter.
Neurobiology: Neurons fire.

The rate at which events occur is often called $\lambda$; the number of events that occur in the domain of study is often called $X$; we write $X \sim \text{Poi}(\lambda)$. Important assumptions about Poisson observations are that two events cannot occur at exactly the same location in space or time, that the occurrence of an event at location $\ell_1$ does not influence whether an event occurs at any other location $\ell_2$, and that the rate at which events arise does not vary over the domain of study.

When a Poisson experiment is observed, $X$ will turn out to be a nonnegative integer. The associated probabilities are given by Equation 1.6.

$$P[X = k \mid \lambda] = \frac{\lambda^k e^{-\lambda}}{k!} \tag{1.6}$$

One of the main themes of statistics is the quantitative way in which data help us learn about the phenomenon we are studying. Example 1.4 shows how this works when we want to learn about the rate $\lambda$ of a Poisson distribution.

Example 1.4 (Seedlings in a Forest)
Tree populations move by dispersing their seeds. Seeds become seedlings, seedlings become saplings, and saplings become adults which eventually produce more seeds. Over time, whole populations may migrate in response to climate change. One instance occurred at the end of the Ice Age when species that had been sequestered in the south were free to move north. Another instance may be occurring today in response to global warming. One critical feature of the migration is its speed. Some of the factors determining the speed are the typical distances of long range seed dispersal, the proportion of seeds that germinate and emerge from the forest floor to become seedlings, and the proportion of seedlings that survive each year.

To learn about emergence and survival, ecologists return annually to forest quadrats (square meter sites) to count seedlings that have emerged since the previous year. One such study was reported in Lavine et al. [2002]. A fundamental quantity of interest is the rate $\lambda$ at which seedlings emerge. Suppose that, in one quadrat, three new seedlings are observed. What does that say about $\lambda$?


Different values of $\lambda$ yield different values of $P[X = 3 \mid \lambda]$. To compare different values of $\lambda$ we see how well each one explains the data $X = 3$; i.e., we compare $P[X = 3 \mid \lambda]$ for different values of $\lambda$. For example,

$$P[X = 3 \mid \lambda = 1] = \frac{1^3 e^{-1}}{3!} \approx 0.06$$
$$P[X = 3 \mid \lambda = 2] = \frac{2^3 e^{-2}}{3!} \approx 0.18$$
$$P[X = 3 \mid \lambda = 3] = \frac{3^3 e^{-3}}{3!} \approx 0.22$$
$$P[X = 3 \mid \lambda = 4] = \frac{4^3 e^{-4}}{3!} \approx 0.20$$

In other words, the value $\lambda = 3$ explains the data almost four times as well as the value $\lambda = 1$ and just a little bit better than the values $\lambda = 2$ and $\lambda = 4$. Figure 1.6 shows $P[X = 3 \mid \lambda]$ plotted as a function of $\lambda$. The figure suggests that $P[X = 3 \mid \lambda]$ is maximized by $\lambda = 3$. The suggestion can be verified by differentiating Equation 1.6 with respect to $\lambda$, equating to 0, and solving. The figure also shows that any value of $\lambda$ from about 0.5 to about 9 explains the data not too much worse than $\lambda = 3$.

Figure 1.6: $P[X = 3 \mid \lambda]$ as a function of $\lambda$
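The four probabilities above, and the claim that $\lambda = 3$ is the maximizer, can be reproduced with a few lines of R. This is only a sketch: the names p, p.formula and best are introduced here, and the search interval (0, 10) is an assumption that simply mirrors the range plotted in Figure 1.6.

```r
# Poisson probabilities of the observation X = 3 for several rates (sketch)
lam <- 1:4
p <- dpois ( 3, lam )                                  # built-in Poisson probability
p.formula <- lam^3 * exp ( -lam ) / factorial ( 3 )    # Equation 1.6 with k = 3
print ( round ( cbind ( lam, p, p.formula ), 3 ) )

# Locate the maximizing rate numerically over an assumed search range (0, 10)
best <- optimize ( function ( l ) dpois ( 3, l ),
                   interval = c ( 0, 10 ), maximum = TRUE )
print ( best$maximum )                                 # should be close to 3
```

The two columns p and p.formula should agree, which confirms that dpois implements Equation 1.6.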


Figure 1.6 was produced by the following snippet.

    lam <- seq ( 0, 10, length=50 )
    y <- dpois ( 3, lam )
    plot ( lam, y, xlab="lambda", ylab="P[x=3]", type="l" )

Note:
• seq stands for "sequence". seq(0,10,length=50) produces a sequence of 50 numbers evenly spaced from 0 to 10.
• dpois calculates probabilities for Poisson distributions the way dbinom does for Binomial distributions.
• plot produces a plot. In the plot(...) command above, lam goes on the x-axis, y goes on the y-axis, xlab and ylab say how the axes are labelled, and type="l" says to plot a line instead of individual points.

Making and interpreting plots is a big part of statistics. Figure 1.6 is a good example. Just by looking at the figure we were able to tell which values of $\lambda$ are plausible and which are not. Most of the figures in this book were produced in R.

1.3.3 The Exponential Distribution

It is often necessary to model a continuous random variable $X$ whose density decreases away from 0. Some examples are
Customer service: time on hold at a help line
Neurobiology: time until the next neuron fires
Seismology: time until the next earthquake
Medicine: remaining years of life for a cancer patient
Ecology: dispersal distance of a seed
In these examples it is expected that most calls, times or distances will be short and a few will be long. So the density should be large near $x = 0$ and decreasing as $x$ increases. A useful pdf for such situations is the Exponential density

$$p(x) = \frac{1}{\lambda} e^{-x/\lambda} \quad \text{for } x > 0. \tag{1.7}$$

We say $X$ has an exponential distribution with parameter $\lambda$ and write $X \sim \text{Exp}(\lambda)$. Figure 1.7 shows exponential densities for several different values of $\lambda$.

Figure 1.7: Exponential densities

Figure 1.7 was produced by the following snippet.

    x <- seq ( 0, 2, length=40 )     # 40 values from 0 to 2
    lam <- c ( 2, 1, .2, .1 )        # 4 different values of lambda
    y <- matrix ( NA, 40, 4 )        # y values for plotting
    for ( i in 1:4 )
      y[,i] <- dexp ( x, 1/lam[i] )  # exponential pdf
    matplot ( x, y, type="l", xlab="x", ylab="p(x)", col=1 )
    legend ( 1.2, 10, paste ( "lambda =", lam ),
             lty=1:4, cex=.75 )

• We want to plot the exponential density for several different values of $\lambda$ so we choose 40 values of x between 0 and 2 at which to do the plotting.
• Next we choose 4 values of $\lambda$.
• Then we need to calculate and save $p(x)$ for each combination of x and $\lambda$. We'll save them in a matrix called y. matrix(NA,40,4) creates the matrix. The size of the matrix is 40 by 4. It is filled with NA, or Not Available.
• dexp calculates the exponential pdf. The argument x tells R the x values at which to calculate the pdf. x can be a vector. The argument 1/lam[i] tells R the value of the parameter. R uses a different notation than this book. Where this book says Exp(2), R says Exp(.5). That's the reason for the 1/lam[i].
• matplot plots one matrix versus another. The first matrix is x and the second is y. matplot plots each column of y against each column of x. In our case x is a vector, so matplot plots each column of y, in turn, against x. type="l" says to plot lines instead of points. col=1 says to use the first color in R's library of colors.
• legend(...) puts a legend on the plot. The 1.2 and 10 are the x and y coordinates of the upper left corner of the legend box. lty=1:4 says to use line types 1 through 4. cex=.75 sets the character expansion factor to .75. In other words, it sets the font size.
• paste(...) creates the words that go into the legend. It pastes together "lambda =" with the four values of lam.

1.3.4 The Normal Distribution

It is often necessary to model a continuous random variable $Y$ whose density is mound-shaped. Some examples are


Biological Anthropology: Heights of people
Oceanography: Ocean temperatures at a particular location
Quality Control: Diameters of ball bearings
Education: SAT scores
In each case the random variable is expected to have a central value around which most of the observations cluster. Fewer and fewer observations are farther and farther away from the center. So the pdf should be unimodal (large in the center and decreasing in both directions away from the center). A useful pdf for such situations is the Normal density

$$p(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2}. \tag{1.8}$$

We say $Y$ has a Normal distribution with mean $\mu$ and standard deviation $\sigma$ and write $Y \sim N(\mu, \sigma)$. Figure 1.8 shows Normal densities for several different values of $(\mu, \sigma)$. As illustrated by the figure, $\mu$ controls the center of the density; each pdf is centered over its own value of $\mu$. On the other hand, $\sigma$ controls the spread. pdf's with larger values of $\sigma$ are more spread out; pdf's with smaller $\sigma$ are tighter.

Figure 1.8 was produced by the following snippet.

    x <- seq ( -6, 6, len=100 )
    y <- cbind ( dnorm ( x, -2, 1 ),
                 dnorm ( x, 0, 2 ),
                 dnorm ( x, 0, .5 ),
                 dnorm ( x, 2, .3 ),
                 dnorm ( x, -.5, 3 )
               )
    matplot ( x, y, type="l", col=1 )
    legend ( -6, 1.3, paste ( "mu =", c(-2,0,0,2,-.5),
                              "; sigma =", c(1,2,.5,.3,3) ),
             lty=1:5, col=1, cex=.75 )

• dnorm(...) computes the Normal pdf. The first argument is the set of x values; the second argument is the mean; the third argument is the standard deviation.
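To connect Equation 1.8 with dnorm, the formula can be typed in directly and compared with R's built-in function. The sketch below is illustrative only: normal.pdf, max.diff and area are names introduced here, and the choice to integrate over 8 standard deviations on either side of the mean is an arbitrary assumption that captures essentially all of the probability.

```r
# Equation 1.8 written out directly and compared with dnorm (sketch)
normal.pdf <- function ( y, mu, sigma )
  exp ( -0.5 * ( ( y - mu ) / sigma )^2 ) / ( sqrt ( 2 * pi ) * sigma )

x <- seq ( -6, 6, len=100 )
mu    <- c ( -2, 0, 0, 2, -.5 )     # the five means used for Figure 1.8
sigma <- c ( 1, 2, .5, .3, 3 )      # the five standard deviations

max.diff <- 0
area <- numeric ( 5 )
for ( i in 1:5 ) {
  # largest disagreement between the formula and dnorm on the grid
  d <- abs ( normal.pdf ( x, mu[i], sigma[i] ) - dnorm ( x, mu[i], sigma[i] ) )
  max.diff <- max ( max.diff, d )
  # area under the curve within 8 standard deviations of the mean
  area[i] <- integrate ( normal.pdf, mu[i] - 8*sigma[i], mu[i] + 8*sigma[i],
                         mu=mu[i], sigma=sigma[i] )$value
}
print ( max.diff )
print ( area )
```

max.diff should be zero up to rounding error, and every entry of area should be essentially 1, whatever the values of mu and sigma.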


Figure 1.8: Normal densities


As a further illustration, Figure 1.9 shows a histogram of 105 ocean temperatures (°C) recorded in the Atlantic Ocean from about 1938 to 1997 at a depth of 1000 meters, near 45 degrees North latitude and 30 degrees West longitude. The N(5.87, .72) density is superimposed on the histogram. The Normal density represents the data moderately well. We will study ocean temperatures in much more detail in a series of examples beginning with Example 1.5.

Figure 1.9: Ocean temperatures at 45°N, 30°W, 1000m depth. The N(5.87, .72) density.

Figure 1.9 was produced by

    hist ( y, prob=T, xlab="temperature", ylab="density",
           ylim=c(0,.6), main="" )
    t <- seq ( 4, 7.5, length=40 )
    lines ( t, dnorm ( t, mean(y), sd(y) ) )

• The 105 temperatures are in a vector y.
• hist produces a histogram. The argument prob=T causes the vertical scale to be probability density instead of counts.
• The line t <- ... sets 40 values of t in the interval [4, 7.5] at which to evaluate the Normal density for plotting purposes.
• lines displays the Normal density.

As usual, you should try to understand the R commands.

The function rnorm(n,mu,sig) generates a random sample from a Normal distribution. n is the sample size; mu is the mean; and sig is the standard deviation. To demonstrate, we'll generate a sample of size 100 from the N(5.87, .72) density, the density in Figure 1.9, and compare the sample histogram to the theoretical density. Figure 1.10(a) shows the comparison. It shows about how good a fit can be expected between a histogram and the Normal density, for a sample of size around 100 in the most ideal case when the sample was actually generated from the Normal distribution. It is interesting to consider whether the fit in Figure 1.9 is much worse.

Figure 1.10(a) was produced by

    samp <- rnorm ( 100, 5.87, .72 )
    y.vals <- seq ( 4, 7.5, length=40 )
    hist ( samp, prob=T, main="(a)",
           xlim=c(4,7.5), xlab="degrees C",
           ylim=c(0,.6), ylab="density" )
    lines ( y.vals, dnorm(y.vals,5.87,.72) )

When working with Normal distributions it is extremely useful to think in terms of units of standard deviation, or simply standard units. One standard unit equals one standard deviation. In Figure 1.10(a) the number 6.6 is about 1 standard unit above the mean, while the number 4.5 is about 2 standard units below the mean. To see why that's a useful way to think, Figure 1.10(b) takes the sample from Figure 1.10(a), multiplies by 9/5 and adds 32, to simulate temperatures measured in °F instead of °C. The histograms in panels (a) and (b) are slightly different because R has chosen the bin boundaries differently; but the two Normal curves have identical shapes. Now consider some temperatures, say 6.5°C = 43.7°F and 4.5°C = 40.1°F. Corresponding temperatures occupy corresponding points on the plots. A vertical line at 6.5 in panel (a) divides the density into two sections exactly congruent to the two sections created by a vertical line at 43.7 in panel (b). A similar statement holds for 4.5 and 40.1. The point is that the two density curves have exactly the same shape. They are identical except for the scale on the horizontal axis, and that scale is determined by the standard deviation. Standard units are a scale-free way of thinking about the picture.

To continue, we converted the temperatures in panels (a) and (b) to standard units, and plotted them in panel (c). Once again, R made a slightly different choice for the bin boundaries, but the Normal curves all have the same shape.

Figure 1.10: (a): A sample of size 100 from N(5.87, .72) and the N(5.87, .72) density. (b): A sample of size 100 from N(42.566, 1.296) and the N(42.566, 1.296) density. (c): A sample of size 100 from N(0, 1) and the N(0, 1) density.

Panels (b) and (c) of Figure 1.10 were produced by

    y2samp <- samp * 9/5 + 32
    y2.vals <- y.vals * 9/5 + 32
    hist ( y2samp, prob=T, main="(b)",
           xlim=c(39.2,45.5), xlab="degrees F",
           ylim=c(0,1/3), ylab="density" )
    lines ( y2.vals, dnorm(y2.vals,42.566,1.296) )

    zsamp <- (samp-5.87) / .72
    z.vals <- (y.vals-5.87) / .72
    hist ( zsamp, prob=T, main="(c)",
           xlim=c(-2.6,2.26), xlab="standard units",
           ylim=c(0,.833), ylab="density" )
    lines ( z.vals, dnorm(z.vals,0,1) )

Let $Y \sim N(\mu, \sigma)$ and define a new random variable $Z = (Y - \mu)/\sigma$. $Z$ is in standard units. It tells how many standard units $Y$ is above or below its mean $\mu$. What is the distribution of $Z$? The easiest way to find out is to calculate $p_Z$, the


density of $Z$, and see whether we recognize it. From Theorem 1.1,

$$p_Z(z) = p_Y(\sigma z + \mu)\,\sigma = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}$$

which we recognize as the N(0, 1) density. I.e., $Z \sim N(0, 1)$. The N(0, 1) distribution is called the standard Normal distribution.

1.4 Centers, Spreads, Means, and Moments

Recall Figure 1.3 (pg. 11). In each panel there is a histogram of a data set along with an estimate of the underlying pdf or pmf $p$. In each case we have found a distribution that matches the data reasonably well, but the distributions we have drawn are not the only ones that match well. We could make modest changes to either distribution and still have a reasonably good match. But whatever pdf we propose for the top panel should be roughly mound shaped with a center around 8° and a spread that ranges from about 6° to about 10°. And in the bottom panel we would want a distribution with a peak around 2 or 3 and a longish right hand tail.

In either case, the details of the distribution matter less than these central features. So statisticians often need to refer to the center, or location, of a sample or a distribution and also to its spread. Section 1.4 gives some of the theoretical underpinnings for talking about centers and spreads of distributions.

Example 1.5
Physical oceanographers study physical properties such as temperature, salinity, pressure, oxygen concentration, and potential vorticity of the world's oceans. Data about the oceans' surface can be collected by satellites' bouncing signals off the surface. But satellites cannot collect data about deep ocean water. Until as recently as the 1970s, the main source of data about deep water came from ships that lower instruments to various depths to record properties of ocean water such as temperature, pressure, salinity, etc. Since about the 1970s oceanographers have begun to employ neutrally buoyant floats. A brief description and history of the floats can be found on the web at www.soc.soton.ac.uk/JRD/HYDRO/shb/float.history.html. Figure 1.11 shows locations, called hydrographic stations, off the coast of Europe and Africa where ship-based measurements were taken between about 1910 and 1990. The outline of the continents is apparent on the right-hand side of the figure due to the lack of measurements over land.

Deep ocean currents cannot be seen but can be inferred from physical properties. Figure 1.12 shows temperatures recorded over time at a depth of 1000 meters at nine different locations. The upper right panel in Figure 1.12 is the same as the top panel of Figure 1.3. Each histogram in Figure 1.12 has a black circle indicating the "center" or "location" of the points that make up the histogram. These centers are good estimates of the centers of the underlying pdf's. The centers range from a low of about 5° at latitude 45 and longitude -40 to a high of about 9° at latitude 35 and longitude -20. (By convention, longitudes to the west of Greenwich, England are negative; longitudes to the east of Greenwich are positive.) It's apparent from the centers that for each latitude, temperatures tend to get colder as we move from east to west. For each longitude, temperatures are warmest at the middle latitude and colder to the north and south. Data like these allow oceanographers to deduce the presence of a large outpouring of relatively warm water called the Mediterranean tongue from the Mediterranean Sea into the Atlantic ocean. The Mediterranean tongue is centered at about 1000 meters depth and 35°N latitude, flows from east to west, and is warmer than the surrounding Atlantic waters into which it flows.

There are many ways of describing the center of a data sample. But by far the most common is the mean. The mean of a sample, or of any list of numbers, is just the average.

Definition 1.1 (Mean of a sample). The mean of a sample, or any list of numbers, $x_1, \ldots, x_n$ is

$$\text{mean of } x_1, \ldots, x_n = \frac{1}{n} \sum x_i. \tag{1.9}$$

The black circles in Figure 1.12 are means. The mean of $x_1, \ldots, x_n$ is often denoted $\bar{x}$. Means are often a good first step in describing data that are unimodal and roughly symmetric.

Similarly, means are often useful in describing distributions. For example, the mean of the pdf in the upper panel of Figure 1.3 is about 8.1, the same as the mean of the data in the same panel. Similarly, in the bottom panel, the mean of the Poi(3.1) distribution is 3.1, the same as the mean of the discoveries data. Of course we chose the distributions to have means that matched the means of the data.

For some other examples, consider the Bin(n, p) distributions shown in Figure 1.5. The center of the Bin(30, .5) distribution appears to be around 15, the center of the Bin(300, .9) distribution appears to be around 270, and so on. The mean of a distribution, or of a random variable, is also called the expected value or expectation and is written $E(X)$.
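Both kinds of mean just described are easy to compute in R. The sketch below is illustrative: the five numbers in x are made up, and the helper center (a name introduced here) treats the center of a Binomial distribution as the probability-weighted average of its possible values, which is the idea that the formal definition of the mean of a random variable makes precise.

```r
# Definition 1.1 applied to a small made-up sample (values are hypothetical)
x <- c ( 5.2, 6.1, 5.9, 6.4, 5.7 )
by.hand <- sum ( x ) / length ( x )
print ( c ( by.hand, mean ( x ) ) )      # the two numbers should agree

# Center of a Binomial distribution as a probability-weighted average
center <- function ( n, p ) sum ( ( 0:n ) * dbinom ( 0:n, n, p ) )
c30  <- center ( 30, .5 )
c300 <- center ( 300, .9 )
print ( c ( c30, c300 ) )
```

The last line should reproduce the centers read off Figure 1.5, namely 15 and 270.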


Figure 1.11: hydrographic stations off the coast of Europe and Africa


Figure 1.12: Water temperatures (°C) at 1000m depth, latitude 25, 35, 45 degrees North, longitude 20, 30, 40 degrees West


Definition 1.2 (Mean of a random variable). Let $X$ be a random variable with cdf $F_X$ and pdf $p_X$. Then the mean of $X$ (equivalently, the mean of $F_X$) is

$$E(X) = \begin{cases} \sum_i i\,P[X = i] & \text{if } X \text{ is discrete} \\ \int x\,p_X(x)\,dx & \text{if } X \text{ is continuous} \end{cases} \tag{1.10}$$

The logic of the definition is that $E(X)$ is a weighted average of the possible values of $X$. Each value is weighted by its importance, or probability. In addition to $E(X)$, another common notation for the mean of a random variable $X$ is $\mu_X$.

Let's look at some of the families of probability distributions that we have already studied and calculate their expectations.

Binomial: If $X \sim \text{Bin}(n, p)$ then

$$\begin{aligned}
E(X) &= \sum_{i=0}^{n} i\,P[X = i] \\
&= \sum_{i=0}^{n} i \binom{n}{i} p^i (1-p)^{n-i} \\
&= \sum_{i=1}^{n} i \binom{n}{i} p^i (1-p)^{n-i} \\
&= np \sum_{i=1}^{n} \frac{(n-1)!}{(i-1)!\,(n-i)!}\, p^{i-1} (1-p)^{n-i} \\
&= np \sum_{j=0}^{n-1} \frac{(n-1)!}{j!\,(n-1-j)!}\, p^{j} (1-p)^{n-1-j} \\
&= np
\end{aligned} \tag{1.11}$$

The first five equalities are just algebra. The sixth is worth remembering. The sum $\sum_{j=0}^{n-1} \cdots$ is the sum of the probabilities of the Bin(n − 1, p) distribution. Therefore the sum is equal to 1. You may wish to compare $E(X)$ to Figure 1.5.

Poisson: If $X \sim \text{Poi}(\lambda)$ then $E(X) = \lambda$. The derivation is left as Exercise 18.

Exponential: If $X \sim \text{Exp}(\lambda)$ then

$$E(X) = \int_0^\infty x\,p(x)\,dx = \lambda^{-1} \int_0^\infty x\,e^{-x/\lambda}\,dx = \lambda.$$

Use integration by parts.

Normal: If $X \sim N(\mu, \sigma)$ then $E(X) = \mu$. The derivation is left as Exercise 18.

Statisticians also need to measure and describe the spread of distributions, random variables and samples. In Figure 1.12, the spread would measure how much variation there is in ocean temperatures at a single location, which in turn would tell us something about how heat moves from place to place in the ocean. Spread could also describe the variation in the annual numbers of "great" discoveries, the range of typical outcomes for a gambler playing a game repeatedly at a casino or an investor in the stock market, or the uncertain effect of a change in the Federal Reserve Bank's monetary policy, or even why different patches of the same forest have different plants on them.

By far the most common measures of spread are the variance and its square root, the standard deviation.

Definition 1.3 (Variance). The variance of a sample $y_1, \ldots, y_n$ is

$$\text{Var}(y_1, \ldots, y_n) = n^{-1} \sum (y_i - \bar{y})^2.$$

The variance of a random variable $Y$ is

$$\text{Var}(Y) = E((Y - \mu_Y)^2)$$

Definition 1.4 (Standard deviation). The standard deviation of a sample $y_1, \ldots, y_n$ is

$$\text{SD}(y_1, \ldots, y_n) = \sqrt{n^{-1} \sum (y_i - \bar{y})^2}.$$

The standard deviation of a random variable $Y$ is

$$\text{SD}(Y) = \sqrt{E((Y - \mu_Y)^2)}.$$

The variance (standard deviation) of $Y$ is often denoted $\sigma_Y^2$ ($\sigma_Y$). The variances of common distributions will be derived later in the book.

Caution: for reasons which we don't go into here, many books define the variance of a sample as $\text{Var}(y_1, \ldots, y_n) = (n-1)^{-1} \sum (y_i - \bar{y})^2$. For large $n$ there is no practical difference between the two definitions. And the definition of variance of a random variable remains unchanged.

While the definition of the variance of a random variable highlights its interpretation as deviations away from the mean, there is an equivalent formula that is sometimes easier to compute.
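The Poisson and Exponential expectations stated earlier in this section, and the caution about the two divisors for the sample variance, can both be checked numerically. The following R sketch is illustrative only. The rate 3.1 echoes the Poi(3.1) distribution mentioned earlier, while the value $\lambda = 2$ for the Exponential, the truncation of the Poisson sum at 200, and the five data values in y are assumptions made here. Note the 1/lam.exp in the call to dexp, which converts this book's parameter to R's.

```r
# Expectations from Definition 1.2, computed numerically (sketch)
lam <- 3.1                                # Poisson rate, as in Poi(3.1)
k <- 0:200                                # truncated range; the tail beyond 200 is negligible
mean.pois <- sum ( k * dpois ( k, lam ) )

lam.exp <- 2                              # the book's Exp(2), i.e. rate 1/2 in R
mean.exp <- integrate ( function ( x ) x * dexp ( x, 1/lam.exp ), 0, Inf )$value

print ( c ( mean.pois, mean.exp ) )       # should be close to 3.1 and 2

# The two sample-variance conventions, on made-up numbers
y <- c ( 5.1, 6.3, 5.8, 6.0, 5.5 )
n <- length ( y )
var.n  <- sum ( ( y - mean ( y ) )^2 ) / n          # divisor n, as in Definition 1.3
var.n1 <- sum ( ( y - mean ( y ) )^2 ) / ( n - 1 )  # divisor n-1
print ( c ( var.n, var.n1, var ( y ) ) )  # var() matches the n-1 version
```

With only five observations the two variances differ noticeably; the gap shrinks as $n$ grows, which is the sense in which there is no practical difference for large $n$.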


Theorem 1.2. If $Y$ is a random variable, then $\text{Var}(Y) = E(Y^2) - (EY)^2$.

Proof.

$$\begin{aligned}
\text{Var}(Y) &= E((Y - EY)^2) \\
&= E(Y^2 - 2Y\,EY + (EY)^2) \\
&= E(Y^2) - 2(EY)^2 + (EY)^2 \\
&= E(Y^2) - (EY)^2
\end{aligned}$$

To develop a feel for what the standard deviation measures, Figure 1.14 repeats Figure 1.12 and adds arrows showing ±1 standard deviation away from the mean. Standard deviations have the same units as the original random variable; variances have squared units. E.g., if $Y$ is measured in degrees, then $\text{SD}(Y)$ is in degrees but $\text{Var}(Y)$ is in degrees². Because of this, SD is easier to interpret graphically. That's why we were able to depict SD's in Figure 1.14.

Most "mound-shaped" samples, that is, samples that are unimodal and roughly symmetric, follow this rule of thumb:
• about 2/3 of the sample falls within about 1 standard deviation of the mean;
• about 95% of the sample falls within about 2 standard deviations of the mean.
The rule of thumb has implications for predictive accuracy. If $x_1, \ldots, x_n$ are a sample from a mound-shaped distribution, then one would predict that future observations will be around $\bar{x}$ with, again, about 2/3 of them within about one SD and about 95% of them within about two SD's.

To illustrate further, we'll calculate the SD of a few mound-shaped random variables and compare the SD's to the pdf's.


Binomial: Let $Y \sim \text{Bin}(30, .5)$.

$$\begin{aligned}
\text{Var}(Y) &= E(Y^2) - (EY)^2 \\
&= \sum_{y=0}^{30} y^2 \binom{30}{y} .5^{30} - 15^2 \\
&= \sum_{y=1}^{30} y\, \frac{30!}{(y-1)!\,(30-y)!}\, .5^{30} - 15^2 \\
&= 15 \sum_{v=0}^{29} (v+1)\, \frac{29!}{v!\,(29-v)!}\, .5^{29} - 15^2 \\
&= 15 \left( \sum_{v=0}^{29} v\, \frac{29!}{v!\,(29-v)!}\, .5^{29} + \sum_{v=0}^{29} \frac{29!}{v!\,(29-v)!}\, .5^{29} \right) - 15^2 \\
&= 15 \left( \frac{29}{2} + 1 - 15 \right) = \frac{15}{2}
\end{aligned} \tag{1.12}$$

and therefore $\text{SD}(Y) = \sqrt{15/2} \approx 2.7$. See Exercises 19 and 20.

Normal: Let $Y \sim N(0, 1)$.

$$\begin{aligned}
\text{Var}(Y) &= E(Y^2) = \int_{-\infty}^{\infty} \frac{y^2}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy \\
&= 2 \int_0^{\infty} \frac{y^2}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy \\
&= 2 \int_0^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy \\
&= 1
\end{aligned} \tag{1.13}$$

and therefore $\text{SD}(Y) = 1$. See Exercises 19 and 20.

Figure 1.13 shows the comparison. The top panel shows the pdf of the Bin(30, .5) distribution; the bottom panel shows the N(0, 1) distribution.
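The two variances just derived, and the rule of thumb about 1 and 2 standard deviations, can be confirmed numerically. The R sketch below is illustrative; the names within1, within2, norm1 and norm2 are introduced here, and "within k SD's" is taken to mean $|y - \text{mean}| \le k \cdot \text{SD}$, which is one reasonable reading of the rule.

```r
# Variance and SD of Bin(30,.5) via Theorem 1.2 (sketch)
y <- 0:30
p <- dbinom ( y, 30, .5 )
m <- sum ( y * p )                          # E(Y)
v <- sum ( y^2 * p ) - m^2                  # E(Y^2) - (EY)^2
s <- sqrt ( v )

# Probability within 1 and 2 SD's of the mean
within1 <- sum ( p [ abs ( y - m ) <= s ] )
within2 <- sum ( p [ abs ( y - m ) <= 2*s ] )

# The same quantities for N(0,1)
v.norm <- integrate ( function ( z ) z^2 * dnorm ( z ), -Inf, Inf )$value
norm1 <- pnorm ( 1 ) - pnorm ( -1 )
norm2 <- pnorm ( 2 ) - pnorm ( -2 )

print ( c ( v, s, v.norm ) )
print ( c ( within1, within2, norm1, norm2 ) )
```

The printed variances should match Equations 1.12 and 1.13. The four coverage probabilities should come out in the neighborhood of 2/3 and 95%, though not exactly, since the rule of thumb is only approximate and the Binomial is discrete.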


Figure 1.13: Two pdf's with ±1 and ±2 SD's. top panel: Bin(30, .5); bottom panel: N(0, 1).


Figure 1.13 was produced by the following R code.

    par ( mfrow=c(2,1) )
    y <- 0:30
    sd <- sqrt ( 15 / 2 )
    plot ( y, dbinom(y,30,.5), ylab="p(y)" )
    arrows ( 15-2*sd, 0, 15+2*sd, 0, angle=60, length=.1,
             code=3, lwd=2 )
    text ( 15, .008, "+/- 2 SD's" )
    arrows ( 15-sd, .03, 15+sd, .03, angle=60, length=.1,
             code=3, lwd=2 )
    text ( 15, .04, "+/- 1 SD" )

    y <- seq(-3,3,length=60)
    plot ( y, dnorm(y,0,1), ylab="p(y)", type="l" )
    arrows ( -2, .02, 2, .02, angle=60, length=.1, code=3, lwd=2 )
    text ( 0, .04, "+/- 2 SD's" )
    arrows ( -1, .15, 1, .15, angle=60, length=.1, code=3, lwd=2 )
    text ( 0, .17, "+/- 1 SD" )

• arrows(x0, y0, x1, y1, length, angle, code, ...) adds arrows to a plot. See the documentation for the meaning of the arguments.
• text adds text to a plot. See the documentation for the meaning of the arguments.

Definition 1.5 (Moment). The r'th moment of a sample $y_1, \ldots, y_n$ or random variable $Y$ is defined as

$$n^{-1} \sum (y_i - \bar{y})^r \quad \text{(for samples)}$$
$$E((Y - \mu_Y)^r) \quad \text{(for random variables)}$$

Variances are second moments. Moments above the second have little applicability.

R has built-in functions to compute means and variances and can compute other moments easily. Note that R uses the divisor $n - 1$ in its definition of variance.


    # Use R to calculate moments of the Bin(100,.5) distribution
    x <- rbinom ( 5000, 100, .5 )
    m <- mean ( x )              # the mean
    v <- var ( x )               # the variance
    s <- sqrt ( v )              # the SD
    mean ( (x-m)^3 )             # the third moment
    our.v <- mean ( (x-m)^2 )    # our variance
    our.s <- sqrt ( our.v )      # our standard deviation
    print ( c ( v, our.v ) )     # not quite equal
    print ( c ( s, our.s ) )     # not quite equal

• rbinom(...) generates random draws from the binomial distribution. The 5000 says how many draws to generate. The 100 and .5 say that the draws are to be from the Bin(100, .5) distribution.

Let $h$ be a function. Then $E[h(Y)] = \int h(y)\,p(y)\,dy$ ($\sum h(y)\,p(y)$ in the discrete case) is the expected value of $h(Y)$ and is called a generalized moment. There are sometimes two ways to evaluate $E[h(Y)]$. One is to evaluate the integral. The other is to let $X = h(Y)$, find $p_X$, and then evaluate $E[X] = \int x\,p_X(x)\,dx$. For example, let $Y$ have pdf $f_Y(y) = 1$ for $y \in (0, 1)$, and let $X = h(Y) = \exp(Y)$.

Method 1

$$E[h(Y)] = \int_0^1 \exp(y)\,dy = \exp(y)\Big|_0^1 = e - 1.$$

Method 2

$$p_X(x) = p_Y(\log(x))\,dy/dx = 1/x$$
$$E[X] = \int_1^e x\,p_X(x)\,dx = \int_1^e 1\,dx = e - 1$$

If $h$ is a linear function then $E[h(Y)]$ has a particularly appealing form.

Theorem 1.3. If $X = a + bY$ then $E[X] = a + bE[Y]$.

Proof. We prove the continuous case; the discrete case is left as an exercise.

$$EX = \int (a + by)\,f_Y(y)\,dy = a \int f_Y(y)\,dy + b \int y\,f_Y(y)\,dy = a + b\,EY$$
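Both methods for $E[\exp(Y)]$, and Theorem 1.3, can be checked with R's integrate function. This is a sketch under stated assumptions: the constants a = 32 and b = 9/5 are hypothetical values chosen here (they echo the Celsius-to-Fahrenheit conversion used earlier), and $Y$ is taken to be uniform on (0, 1) as in the example above.

```r
# E[exp(Y)] for Y uniform on (0,1), computed two ways (sketch)
m1 <- integrate ( function ( y ) exp ( y ) * dunif ( y, 0, 1 ), 0, 1 )$value  # Method 1
m2 <- integrate ( function ( x ) x * ( 1/x ), 1, exp ( 1 ) )$value            # Method 2
print ( c ( m1, m2, exp ( 1 ) - 1 ) )     # all three should agree

# Theorem 1.3 with hypothetical constants a and b
a <- 32
b <- 9/5
EY <- integrate ( function ( y ) y * dunif ( y, 0, 1 ), 0, 1 )$value
EX <- integrate ( function ( y ) ( a + b*y ) * dunif ( y, 0, 1 ), 0, 1 )$value
print ( c ( EX, a + b*EY ) )              # the two numbers should agree
```

In Method 2 the integrand is $x\,p_X(x) = x \cdot (1/x)$, written out in full so that the role of $p_X$ stays visible.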


There is a corresponding theorem for variance.

Theorem 1.4. If $X = a + bY$ then $\text{Var}(X) = b^2\,\text{Var}(Y)$.

Proof. We prove the continuous case; the discrete case is left as an exercise. Let $\mu = E[Y]$.

$$\begin{aligned}
\text{Var}(X) &= E[(a + bY - (a + b\mu))^2] \\
&= E[b^2 (Y - \mu)^2] \\
&= b^2\,\text{Var}(Y)
\end{aligned}$$

1.5 Joint, Marginal and Conditional Probability

Statisticians often have to deal simultaneously with the probabilities of several events, quantities, or random variables. For example, we may classify voters in a city according to political party affiliation and support for a school bond referendum. Let $A$ and $S$ be a voter's affiliation and support, respectively.

$$A = \begin{cases} D & \text{if Democrat} \\ R & \text{if Republican} \end{cases} \qquad \text{and} \qquad S = \begin{cases} Y & \text{if in favor} \\ N & \text{if opposed} \end{cases}$$

Suppose a polling organization finds that 80% of Democrats and 35% of Republicans favor the bond referendum. The 80% and 35% are called conditional probabilities because they are conditional on party affiliation. The notation for conditional probabilities is $p_{S \mid A}$. As usual, the subscript indicates which random variables we're talking about. Specifically,

$$p_{S \mid A}(Y \mid D) = 0.80; \quad p_{S \mid A}(N \mid D) = 0.20; \quad p_{S \mid A}(Y \mid R) = 0.35; \quad p_{S \mid A}(N \mid R) = 0.65.$$

We say "the conditional probability that $S = N$ given $A = D$ is 0.20", etc.

Suppose further that 60% of voters in the city are Democrats. Then 80% of 60% = 48% of the voters are Democrats who favor the referendum. The 48% is called a joint probability because it is the probability of ($A = D$, $S = Y$) jointly. The notation is $p_{A,S}(D, Y) = .48$. Likewise, $p_{A,S}(D, N) = .12$; $p_{A,S}(R, Y) = .14$; and $p_{A,S}(R, N) = 0.26$. Table 1.1 summarizes the calculations. The quantities


Figure 1.14: Water temperatures (°C) at 1000m depth, latitude 25, 35, 45 degrees North, longitude 20, 30, 40 degrees West, with standard deviations


                 For     Against
    Democrat     48%     12%       60%
    Republican   14%     26%       40%
                 62%     38%

Table 1.1: Party Affiliation and Referendum Support

.60, .40, .62, and .38 are called marginal probabilities. The name derives from historical reasons, because they were written in the margins of the table. Marginal probabilities are probabilities for one variable alone, the ordinary probabilities that we've been talking about all along.

The event $A = D$ can be partitioned into the two smaller events ($A = D$, $S = Y$) and ($A = D$, $S = N$). So

$$p_A(D) = .60 = .48 + .12 = p_{A,S}(D, Y) + p_{A,S}(D, N).$$

The event $A = R$ can be partitioned similarly. Likewise, the event $S = Y$ can be partitioned into ($A = D$, $S = Y$) and ($A = R$, $S = Y$). So

$$p_S(Y) = .62 = .48 + .14 = p_{A,S}(D, Y) + p_{A,S}(R, Y).$$

These calculations illustrate a general principle: To get a marginal probability for one variable, add the joint probabilities for all values of the other variable. The general formulae for working simultaneously with two discrete random variables $X$ and $Y$ are

$$f_{X,Y}(x, y) = f_X(x) \cdot f_{Y \mid X}(y \mid x) = f_Y(y) \cdot f_{X \mid Y}(x \mid y) \tag{1.14}$$
$$f_X(x) = \sum_y f_{X,Y}(x, y) \qquad f_Y(y) = \sum_x f_{X,Y}(x, y)$$

Sometimes we know joint probabilities and need to find marginals and conditionals; sometimes it's the other way around. And sometimes we know $f_X$ and $f_{Y \mid X}$ and need to find $f_Y$ or $f_{X \mid Y}$. The following story is an example of the latter. It is a common problem in drug testing, disease screening, polygraph testing, and many other fields.

The participants in an athletic competition are to be randomly tested for steroid use. The test is 90% accurate in the following sense: for athletes who use steroids, the test has a 90% chance of returning a positive result; for non-users, the test has a 10% chance of returning a positive result. Suppose that only 30% of athletes use steroids. An athlete is randomly selected. Her test returns a positive result. What is the probability that she is a steroid user?

This is a problem of two random variables, $U$, the steroid use of the athlete and $T$, the test result of the athlete. Let $U = 1$ if the athlete uses steroids; $U = 0$ if not. Let $T = 1$ if the test result is positive; $T = 0$ if not. We want $f_{U \mid T}(1 \mid 1)$. We can calculate $f_{U \mid T}$ if we know $f_{U,T}$; and we can calculate $f_{U,T}$ because we know $f_U$ and $f_{T \mid U}$. Pictorially,

$$f_U,\ f_{T \mid U} \longrightarrow f_{U,T} \longrightarrow f_{U \mid T}$$

The calculations are

$$f_{U,T}(0, 0) = (.7)(.9) = .63 \qquad f_{U,T}(0, 1) = (.7)(.1) = .07$$
$$f_{U,T}(1, 0) = (.3)(.1) = .03 \qquad f_{U,T}(1, 1) = (.3)(.9) = .27$$

so

$$f_T(0) = .63 + .03 = .66 \qquad f_T(1) = .07 + .27 = .34$$

and finally

$$f_{U \mid T}(1 \mid 1) = f_{U,T}(1, 1) / f_T(1) = .27 / .34 \approx .80.$$

In other words, even though the test is 90% accurate, the athlete has only an 80% chance of using steroids. If that doesn't seem intuitively reasonable, think of a large number of athletes, say 100. About 30 will be steroid users of whom about 27 will test positive. About 70 will be non-users of whom about 7 will test positive. So there will be about 34 athletes who test positive, of whom about 27, or 80%, will be users.

Table 1.2 is another representation of the same problem. It is important to become familiar with the concepts and notation in terms of marginal, conditional and joint distributions, and not to rely too heavily on the tabular representation because in more complicated problems there is no convenient tabular representation.

               T = 0    T = 1
    U = 0      .63      .07      .70
    U = 1      .03      .27      .30
               .66      .34

Table 1.2: Steroid Use and Test Results

Example 1.6 is a further illustration of joint, conditional, and marginal distributions.

Example 1.6 (Seedlings)
Example 1.4 introduced an observational experiment to learn about the rate of seedling production and survival at the Coweeta Long Term Ecological Research station in western North Carolina. For a particular quadrat in a particular year, let $N$ be the number of new seedlings that emerge. Suppose that $N \sim \text{Poi}(\lambda)$ for some $\lambda > 0$. Each seedling either dies over the winter or survives to become an old seedling the next year. Let $\theta$ be the probability of survival and $X$ be the number of seedlings that survive. Suppose that the survival of any one seedling is not affected by the survival of any other seedling. Then $X \sim \text{Bin}(N, \theta)$. Figure 1.15 shows the possible values of the pair $(N, X)$. The probabilities associated with each of the points in Figure 1.15 are denoted $f_{N,X}$ where, as usual, the subscript indicates which variables we're talking about. For example, $f_{N,X}(3, 2)$ is the probability that $N = 3$ and $X = 2$.

Figure 1.15: Permissible values of $N$ and $X$, the number of new seedlings and the number that survive.

The next step is to figure out what the joint probabilities are. Consider, for example, the event $N = 3$. That event can be partitioned into the four smaller events ($N =$

PAGE 58

1.5.JOINT,MARGINALANDCONDITIONALPROBABILITY 45 3 ;X =0 N =3 ;X =1 N =3 ;X =2 ,and N =3 ;X =3 .So f N = f N;X ; 0+ f N;X ; 1+ f N;X ; 2+ f N;X ; 3 ThePoissonmodelfor N says f N =P[ N =3]= e )]TJ/F42 7.9701 Tf 6.586 0 Td [( 3 = 6 .Buthowisthetotal e )]TJ/F42 7.9701 Tf 6.587 0 Td [( 3 = 6 dividedintothefourparts?That'swheretheBinomialmodelfor X comesin. ThedivisionismadeaccordingtotheBinomialprobabilities 3 0 )]TJ/F41 11.9552 Tf 11.955 0 Td [( 3 3 1 )]TJ/F41 11.9552 Tf 11.956 0 Td [( 2 3 2 2 )]TJ/F41 11.9552 Tf 11.955 0 Td [( 3 3 3 The e )]TJ/F42 7.9701 Tf 6.586 0 Td [( 3 = 6 isamarginalprobabilitylikethe60%inthealiation/supportproblem. Thebinomialprobabilitiesaboveare conditional probabilitieslikethe80%and20%;they areconditionalon N =3 .Thenotationis f X j N j 3 or P[ X =2 j N =3] .Thejoint probabilitiesare f N;X ; 0= e )]TJ/F42 7.9701 Tf 6.587 0 Td [( 3 6 )]TJ/F41 11.9552 Tf 11.955 0 Td [( 3 f N;X ; 1= e )]TJ/F42 7.9701 Tf 6.586 0 Td [( 3 6 3 )]TJ/F41 11.9552 Tf 11.955 0 Td [( 2 f N;X ; 2= e )]TJ/F42 7.9701 Tf 6.586 0 Td [( 3 6 3 2 )]TJ/F41 11.9552 Tf 11.955 0 Td [( f N;X ; 3= e )]TJ/F42 7.9701 Tf 6.586 0 Td [( 3 6 3 Ingeneral, f N;X n;x = f N n f X j N x j n = e )]TJ/F42 7.9701 Tf 6.587 0 Td [( n n n x x )]TJ/F41 11.9552 Tf 11.955 0 Td [( n )]TJ/F42 7.9701 Tf 6.586 0 Td [(x Anecologistmightbeinterestedin f X ,thepdfforthenumberofseedlingsthatwill berecruitedintothepopulationinaparticularyear.Foraparticularnumber x f X x islikelookinginFigure1.15alongthehorizontallinecorrespondingto X = x .Toget f X x P[ X = x ] ,wemustaddupalltheprobabilitiesonthatline. 
$$\begin{aligned} f_X(x) &= \sum_n f_{N,X}(n, x) = \sum_{n=x}^{\infty} \frac{e^{-\lambda}\lambda^n}{n!}\binom{n}{x}\theta^x(1 - \theta)^{n - x} \\ &= \sum_{n=x}^{\infty} \frac{e^{-\lambda(1-\theta)}(\lambda(1 - \theta))^{n - x}}{(n - x)!} \cdot \frac{e^{-\lambda\theta}(\lambda\theta)^x}{x!} \\ &= \frac{e^{-\lambda\theta}(\lambda\theta)^x}{x!} \sum_{z=0}^{\infty} \frac{e^{-\lambda(1-\theta)}(\lambda(1 - \theta))^z}{z!} \\ &= \frac{e^{-\lambda\theta}(\lambda\theta)^x}{x!} \end{aligned}$$


The last equality follows since $\sum_z \cdots = 1$ because it is the sum of probabilities from the $\mathrm{Poi}(\lambda(1 - \theta))$ distribution. The final result is recognized as a probability from the $\mathrm{Poi}(\lambda^*)$ distribution where $\lambda^* = \lambda\theta$. So $X \sim \mathrm{Poi}(\lambda^*)$.

In the derivation we used the substitution $z = n - x$. The trick is worth remembering.

For continuous random variables, conditional and joint densities are written $p_{X|Y}(x \mid y)$ and $p_{X,Y}(x, y)$ respectively and, analogously to Equation 1.14, we have

$$p_{X,Y}(x, y) = p_X(x)\, p_{Y|X}(y \mid x) = p_Y(y)\, p_{X|Y}(x \mid y) \tag{1.15}$$

$$p_X(x) = \int_{-\infty}^{\infty} p_{X,Y}(x, y)\, dy \qquad p_Y(y) = \int_{-\infty}^{\infty} p_{X,Y}(x, y)\, dx$$

The logic is the same as for discrete random variables. In order for $(X = x, Y = y)$ to occur we need either of the following.

1. First $X = x$ occurs, then $Y = y$ occurs. The "probability" of that happening is just $p_X(x) \times p_{Y|X}(y \mid x)$, the "probability" that $X = x$ occurs times the "probability" that $Y = y$ occurs under the condition that $X = x$ has already occurred. "Probability" is in quotes because, for continuous random variables, the probability is 0. But "probability" is a useful way to think intuitively.

2. First $Y = y$ occurs, then $X = x$ occurs. The reasoning is similar to that in item 1.

Just as for single random variables, probabilities are integrals of the density. If $A$ is a region in the $(x, y)$ plane, $P[(X, Y) \in A] = \int_A p(x, y)\, dx\, dy$, where $\int_A \ldots$ indicates a double integral over the region $A$.

Just as for discrete random variables, the unconditional density of a random variable is called its marginal density; $p_X$ and $p_Y$ are marginal densities. Let $B \subset \mathbb{R}$ be a set. Since a density is "the function that must be integrated to calculate a probability", on one hand, $P[X \in B] = \int_B p_X(x)\, dx$. On the other hand,

$$P[X \in B] = P[(X, Y) \in B \times \mathbb{R}] = \int_B \left( \int_{\mathbb{R}} p_{X,Y}(x, y)\, dy \right) dx$$

which implies $p_X(x) = \int_{\mathbb{R}} p_{X,Y}(x, y)\, dy$.

An example will help illustrate. A customer calls the computer Help Line. Let $X$ be the amount of time he spends on hold and $Y$ be the total duration of the call.
The amount of time a consultant spends with him after his call is answered is $W = Y - X$. Suppose the joint density is $p_{X,Y}(x, y) = e^{-y}$ in the region $0 < x < y < \infty$.

1. What is the marginal density of $X$?

$$p(x) = \int p(x, y)\, dy = \int_x^{\infty} e^{-y}\, dy = -e^{-y}\Big|_x^{\infty} = e^{-x}$$

2. What is the marginal density of $Y$?

$$p(y) = \int p(x, y)\, dx = \int_0^y e^{-y}\, dx = y e^{-y}$$

3. What is the conditional density of $X$ given $Y$?

$$p(x \mid y) = \frac{p(x, y)}{p(y)} = y^{-1}$$

4. What is the conditional density of $Y$ given $X$?

$$p(y \mid x) = \frac{p(x, y)}{p(x)} = e^{x - y}$$

5. What is the marginal density of $W$?

$$\begin{aligned} p(w) &= \frac{d}{dw} P[W \le w] = \frac{d}{dw} \int_0^{\infty} \int_x^{x + w} e^{-y}\, dy\, dx = \frac{d}{dw} \int_0^{\infty} -e^{-y}\Big|_x^{x + w}\, dx \\ &= \frac{d}{dw} \int_0^{\infty} e^{-x}(1 - e^{-w})\, dx = e^{-w} \end{aligned}$$

Figure 1.16 illustrates the Help Line calculations. For questions 1 and 2, the answer comes from using Equations 1.15. The only part deserving comment is the limits of integration. In question 1, for example, for any particular value $X = x$, $Y$ ranges from $x$ to $\infty$, as can be seen from panel (a) of the figure. That's where the limits of integration come from. In question 2, for any particular $y$, $X \in (0, y)$, which are the limits of integration. Panel (d) shows the conditional density of $X$ given $Y$ for three different values of $Y$. We see that the density of $X$ is uniform on the interval $(0, y)$. See Section 5.4 for discussion of this density. Panel (e) shows the conditional density of $Y$ given $X$ for three different values of $X$. It shows, first, that $Y > X$ and second, that the density of $Y$ decays exponentially. See Section 1.3.3 for discussion of this density. Panel (f) shows the region of integration for question 5. Take the time to understand the method being used to answer question 5.

[Figure 1.16: (a): the region of $\mathbb{R}^2$ where $(X, Y)$ live; (b): the marginal density of $X$; (c): the marginal density of $Y$; (d): the conditional density of $X$ given $Y$ for three values of $Y$; (e): the conditional density of $Y$ given $X$ for three values of $X$; (f): the region $W \le w$]

When dealing with a random variable $X$, sometimes its pdf is given to us and we can calculate its expectation:

$$E(X) = \int x\, p(x)\, dx.$$

(The integral is replaced by a sum if $X$ is discrete.) Other times $X$ arises more naturally as part of a pair $(X, Y)$ and its expectation is

$$E(X) = \iint x\, p(x, y)\, dx\, dy.$$

The two formulae are, of course, equivalent. But when $X$ does arise as part of a pair, there is still another way to view $p(x)$ and $E(X)$:

$$p_X(x) = \int \big( p_{X|Y}(x \mid y) \big)\, p_Y(y)\, dy = E\big( p_{X|Y}(x \mid Y) \big) \tag{1.16}$$

$$E(X) = \int \left( \int x\, p_{X|Y}(x \mid y)\, dx \right) p_Y(y)\, dy = E(E(X \mid Y)). \tag{1.17}$$

The notation deserves some explanation. For any number $x$, $p_{X|Y}(x \mid y)$ is a function of $y$, say $g(y)$. The middle term in Equation 1.16 is $\int g(y)\, p(y)\, dy$, which equals $E(g(Y))$, which is the right hand term. Similarly, $E(X \mid Y)$ is a function of $Y$, say $h(Y)$. The middle term in Equation 1.17 is $\int h(y)\, p(y)\, dy$, which equals $E(h(Y))$, which is the right hand term.

Example 1.7 (Seedlings, continued)
Examples 1.4 and 1.6 discussed $(N, X)$, the number of new seedlings in a forest quadrat and the number of those that survived over the winter. Example 1.6 suggested the statistical model

$$N \sim \mathrm{Poi}(\lambda) \qquad \text{and} \qquad X \mid N \sim \mathrm{Bin}(N, \theta).$$

Equation 1.17 shows that $E(X)$ can be computed as

$$E(X) = E(E(X \mid N)) = E(N\theta) = \theta E(N) = \theta\lambda.$$

Example 1.8 (Craps, continued)
Examples 1.1, 1.2 and 1.3 introduced the game of craps. Example 1.8 calculates the chance of winning. Let

$$X = \begin{cases} 0 & \text{if shooter loses} \\ 1 & \text{if shooter wins.} \end{cases}$$


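$X$ has a Bernoulli distribution. We are trying to find

$$P[\text{shooter wins}] = p_X(1) = E(X).$$

(Make sure you see why $p_X(1) = E(X)$.) Let $Y$ be the outcome of the Come-out roll. Equation 1.17 says

$$\begin{aligned} E(X) = E(E(X \mid Y)) &= E(X \mid Y = 2)\, P[Y = 2] + E(X \mid Y = 3)\, P[Y = 3] \\ &\quad + E(X \mid Y = 4)\, P[Y = 4] + E(X \mid Y = 5)\, P[Y = 5] \\ &\quad + E(X \mid Y = 6)\, P[Y = 6] + E(X \mid Y = 7)\, P[Y = 7] \\ &\quad + E(X \mid Y = 8)\, P[Y = 8] + E(X \mid Y = 9)\, P[Y = 9] \\ &\quad + E(X \mid Y = 10)\, P[Y = 10] + E(X \mid Y = 11)\, P[Y = 11] \\ &\quad + E(X \mid Y = 12)\, P[Y = 12] \\ &= 0 \times \tfrac{1}{36} + 0 \times \tfrac{2}{36} + E(X \mid Y = 4)\, \tfrac{3}{36} + E(X \mid Y = 5)\, \tfrac{4}{36} \\ &\quad + E(X \mid Y = 6)\, \tfrac{5}{36} + 1 \times \tfrac{6}{36} + E(X \mid Y = 8)\, \tfrac{5}{36} + E(X \mid Y = 9)\, \tfrac{4}{36} \\ &\quad + E(X \mid Y = 10)\, \tfrac{3}{36} + 1 \times \tfrac{2}{36} + 0 \times \tfrac{1}{36}. \end{aligned}$$

So it only remains to find $E(X \mid Y = y)$ for $y = 4, 5, 6, 8, 9, 10$. The calculations are all similar. We will do one of them to illustrate. Let $w = E(X \mid Y = 5)$ and let $z$ denote the next roll of the dice. Once 5 has been established as the point, a roll of the dice has three possible outcomes: win (if $z = 5$), lose (if $z = 7$), or roll again (if $z$ is anything else). Therefore

$$\begin{aligned} w &= 1 \times 4/36 + 0 \times 6/36 + w \times 26/36 \\ (10/36)\, w &= 4/36 \\ w &= 4/10. \end{aligned}$$

After similar calculations for the other possible points we find

$$E(X) = \frac{3}{9}\,\frac{3}{36} + \frac{4}{10}\,\frac{4}{36} + \frac{5}{11}\,\frac{5}{36} + \frac{6}{36} + \frac{5}{11}\,\frac{5}{36} + \frac{4}{10}\,\frac{4}{36} + \frac{3}{9}\,\frac{3}{36} + \frac{2}{36} \approx .493.$$

Craps is a very fair game; the house has only a slight edge.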
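The conditioning argument of Example 1.8 can also be checked numerically. The following R sketch is not part of the original example; the names `dice`, `p.roll`, `p.win.given` and `p.win` are introduced here only for illustration. It assumes two fair dice and uses the fact, obtained above for the point 5, that a point $y$ is made with probability $P[y] / (P[y] + P[7])$.

```r
# All 36 equally likely outcomes of two fair dice
dice <- expand.grid(d1 = 1:6, d2 = 1:6)

# P[Y = y] for the come-out roll, y = 2, ..., 12
p.roll <- sapply(2:12, function(y) mean(dice$d1 + dice$d2 == y))
names(p.roll) <- 2:12

# E(X | Y = y): chance that the shooter wins given come-out roll y
p.win.given <- sapply(2:12, function(y) {
  if (y %in% c(7, 11)) return(1)        # immediate win
  if (y %in% c(2, 3, 12)) return(0)     # immediate loss
  p.y <- p.roll[[as.character(y)]]
  p.y / (p.y + p.roll[["7"]])           # roll the point before rolling a 7
})

# Equation 1.17: E(X) = sum over y of E(X | Y = y) P[Y = y]
p.win <- sum(p.win.given * p.roll)
print(round(p.win, 3))                  # 0.493
```

The same pattern, a vector of conditional expectations weighted by a vector of marginal probabilities, applies to any use of Equation 1.17 with a discrete conditioning variable.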


1.6 Association, Dependence, Independence

It is often useful to describe or measure the degree of association between two random variables $X$ and $Y$. The R dataset iris provides a good example. It contains the lengths and widths of sepals and petals of 150 iris plants. The first several lines of iris are

      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa

Figure 1.17 shows each variable plotted against every other variable. It is evident from the plot that petal length and petal width are very closely associated with each other, while the relationship between sepal length and sepal width is much weaker. Statisticians need a way to quantify the strength of such relationships.

[Figure 1.17: Lengths and widths of sepals and petals of 150 iris plants]

Figure 1.17 was produced by the following line of R code.

    pairs ( iris[,1:4] )

• pairs() produces a pairs plot, a matrix of scatterplots of each pair of variables. The names of the variables are shown along the main diagonal of the matrix. The $(i, j)$th plot in the matrix is a plot of variable $i$ versus variable $j$. For example, the upper right plot has sepal length on the vertical axis and petal width on the horizontal axis.

By far the most common measures of association are covariance and correlation.

Definition 1.6. The covariance of $X$ and $Y$ is

$$\mathrm{Cov}(X, Y) \equiv E\big( (X - \mu_X)(Y - \mu_Y) \big)$$

In R, cov measures the covariance in a sample. Thus, cov(iris[,1:4]) produces the matrix

                 Sepal.Length Sepal.Width Petal.Length Petal.Width
    Sepal.Length   0.68569351 -0.04243400    1.2743154   0.5162707
    Sepal.Width   -0.04243400  0.18997942   -0.3296564  -0.1216394
    Petal.Length   1.27431544 -0.32965638    3.1162779   1.2956094
    Petal.Width    0.51627069 -0.12163937    1.2956094   0.5810063

in which the diagonal entries are variances and the off-diagonal entries are covariances.

The measurements in iris are in centimeters. To change to millimeters we would multiply each measurement by 10. Here's how that affects the covariances.

    > cov ( 10*iris[,1:4] )
                 Sepal.Length Sepal.Width Petal.Length Petal.Width
    Sepal.Length    68.569351   -4.243400    127.43154    51.62707
    Sepal.Width     -4.243400   18.997942    -32.96564   -12.16394
    Petal.Length   127.431544  -32.965638    311.62779   129.56094
    Petal.Width     51.627069  -12.163937    129.56094    58.10063

Each covariance has been multiplied by 100 because each variable has been multiplied by 10. In fact, this rescaling is a special case of the following theorem.

Theorem 1.5. Let $X$ and $Y$ be random variables. Then $\mathrm{Cov}(aX + b, cY + d) = ac\, \mathrm{Cov}(X, Y)$.

Proof.

$$\begin{aligned} \mathrm{Cov}(aX + b, cY + d) &= E\big( (aX + b - (a\mu_X + b))(cY + d - (c\mu_Y + d)) \big) \\ &= E\big( ac\, (X - \mu_X)(Y - \mu_Y) \big) = ac\, \mathrm{Cov}(X, Y) \end{aligned}$$

Theorem 1.5 shows that $\mathrm{Cov}(X, Y)$ depends on the scales in which $X$ and $Y$ are measured. A scale-free measure of association would also be useful. Correlation is the most common such measure.

Definition 1.7. The correlation between $X$ and $Y$ is

$$\mathrm{Cor}(X, Y) \equiv \frac{\mathrm{Cov}(X, Y)}{\mathrm{SD}(X)\, \mathrm{SD}(Y)}$$

cor measures correlation. The correlations in iris are

    > cor(iris[,1:4])
                 Sepal.Length Sepal.Width Petal.Length Petal.Width
    Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
    Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
    Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
    Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000

which confirms the visual impression that sepal length, petal length, and petal width are highly associated with each other, but are only loosely associated with sepal width.

Theorem 1.6 tells us that correlation is unaffected by linear changes in measurement scale.

Theorem 1.6. Let $X$ and $Y$ be random variables and let $ac > 0$. Then $\mathrm{Cor}(aX + b, cY + d) = \mathrm{Cor}(X, Y)$. (If $ac < 0$ the correlation changes sign but not magnitude.)

Proof. See Exercise 40.

Correlation doesn't measure all types of association; it only measures clustering around a straight line. The first two columns of Figure 1.18 show data sets that cluster around a line, but with some scatter above and below the line. These data sets are all well described by their correlations, which measure the extent of the clustering; the higher the correlation, the tighter the points cluster around the line and the less they scatter. Negative values of the correlation correspond to lines with negative slopes. The last column of the figure shows some other situations. The first panel of the last column is best described as having two isolated clusters of points. Despite the correlation of .96, the panel does not look at all like the last panel of the second column. The second and third panels of the last column show data sets that follow some nonlinear pattern of association. Again, their correlations are misleading. Finally, the last panel of the last column shows a data set in which most of the points are tightly clustered around a line but in which there are two outliers. The last column demonstrates that correlations are not good descriptors of nonlinear data sets or data sets with outliers.

Correlation measures linear association between random variables. But sometimes we want to say whether two random variables have any association at all, not just linear.

Definition 1.8. Two random variables, $X$ and $Y$, are said to be independent if $p(X \mid Y) = p(X)$, for all values of $Y$. If $X$ and $Y$ are not independent then they are said to be dependent.

If $X$ and $Y$ are independent then it is also true that $p(Y \mid X) = p(Y)$. The interpretation is that knowing one of the random variables does not change the probability distribution of the other. If $X$ and $Y$ are independent (dependent) we write $X \perp Y$ ($X \not\perp Y$). If $X$ and $Y$ are independent then $\mathrm{Cov}(X, Y) = \mathrm{Cor}(X, Y) = 0$. The converse is not true. Also, if $X \perp Y$ then $p(x, y) = p(x)\, p(y)$. This last equality is usually taken to be the definition of independence.
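Theorems 1.5 and 1.6, and the remark that zero covariance does not imply independence, can be checked with a few lines of R. This is a sketch added for illustration, not part of the original text: the constants `a`, `b`, `c0`, `d` are arbitrary, and the small discrete example ($X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$) is an assumption chosen because its covariance can be computed exactly.

```r
# Theorem 1.5: Cov(aX + b, cY + d) = ac Cov(X, Y), checked on two iris columns
x <- iris$Petal.Length
y <- iris$Petal.Width
a <- 10; b <- 3; c0 <- 2; d <- -1       # arbitrary constants, with a * c0 > 0
lhs <- cov(a * x + b, c0 * y + d)
rhs <- a * c0 * cov(x, y)
print(c(lhs, rhs))                      # the two values agree up to rounding

# Theorem 1.6: the correlation is unchanged by the same rescaling
print(c(cor(a * x + b, c0 * y + d), cor(x, y)))

# Zero covariance without independence: X uniform on {-1, 0, 1}, Y = X^2
xs <- c(-1, 0, 1)
px <- rep(1/3, 3)
ys <- xs^2
cov.xy <- sum(px * xs * ys) - sum(px * xs) * sum(px * ys)
print(cov.xy)                           # 0
# Yet P[Y = 0 | X = 0] = 1 while P[Y = 0] = 1/3, so X and Y are dependent
print(sum(px[ys == 0]))
```

In the last example $Y$ is completely determined by $X$, which is about as dependent as two variables can be, and still the covariance vanishes because the association is not linear.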


[Figure 1.18: correlations]


Do not confuse independent with mutually exclusive. Let $X$ denote the outcome of a die roll and let $A = 1$ if $X \in \{1, 2, 3\}$ and $A = 0$ if $X \in \{4, 5, 6\}$. $A$ is called an indicator variable because it indicates the occurrence of a particular event. There is a special notation for indicator variables:

$$A = 1_{\{1,2,3\}}(X).$$

$1_{\{1,2,3\}}$ is an indicator function. $1_{\{1,2,3\}}(X)$ is either 1 or 0 according to whether $X$ is in the subscript. Let $B = 1_{\{4,5,6\}}(X)$, $C = 1_{\{1,3,5\}}(X)$, $D = 1_{\{2,4,6\}}(X)$ and $E = 1_{\{1,2,3,4\}}(X)$. $A$ and $B$ are dependent because $P[A] = .5$ but $P[A \mid B] = 0$. $D$ and $E$ are independent because $P[D] = P[D \mid E] = .5$. You can also check that $P[E] = P[E \mid D] = 2/3$. Do not confuse dependence with causality. $A$ and $B$ are dependent, but neither causes the other.

For an example, recall the Help Line story in Section 1.5. $X$ and $Y$ were the amount of time on hold and the total length of the call, respectively. The difference was $W = Y - X$. We found $p(x \mid y) = y^{-1}$. Because $p(x \mid y)$ depends on $y$, $X \not\perp Y$. Similarly, $p(y \mid x)$ depends on $x$. Does that make sense? Would knowing something about $X$ tell us anything about $Y$?

What about $X$ and $W$? Are they independent?

$$p(w \mid x) = \frac{d}{dw} P[W \le w \mid X = x] = \frac{d}{dw} P[Y \le w + x \mid X = x] = e^{-w}$$

which does not depend on $x$. Therefore $X \perp W$. Does that make sense? Would knowing something about $X$ tell us anything about $W$?
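The die-roll claims above can be verified by direct enumeration. The R sketch below is an illustration added here, not part of the original text; the helper names `prob` and `cond.prob` are hypothetical, and the code assumes a fair die so that each of the six outcomes has probability 1/6.

```r
x <- 1:6                                # equally likely outcomes of one die roll

# Indicator variables, stored as logical vectors over the six outcomes
A <- x %in% c(1, 2, 3)
B <- x %in% c(4, 5, 6)
D <- x %in% c(2, 4, 6)
E <- x %in% c(1, 2, 3, 4)

# P[event] and P[event | given] when all outcomes are equally likely
prob <- function(event) mean(event)
cond.prob <- function(event, given) mean(event & given) / mean(given)

print(c(prob(A), cond.prob(A, B)))      # .5 and 0: A and B are dependent
print(c(prob(D), cond.prob(D, E)))      # .5 and .5: D and E are independent
print(c(prob(E), cond.prob(E, D)))      # both 2/3
```

Note that $A$ and $B$ are mutually exclusive and, for exactly that reason, strongly dependent: learning that $B$ occurred rules out $A$.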
Example 1.9 (Seedlings, continued)
Examples 1.4, 1.6, and 1.7 were about new seedlings in forest quadrats. Suppose that ecologists observe the number of new seedlings in a quadrat for $k$ successive years; call the observations $N_1, \ldots, N_k$. If the seedling arrival rate is the same every year, then we could adopt the model $N_i \sim \mathrm{Poi}(\lambda)$. I.e., $\lambda$ is the same for every year. If $\lambda$ is known, or if we condition on $\lambda$, then the number of new seedlings in one year tells us nothing about the number of new seedlings in another year, so we would model the $N_i$'s as being independent, and their joint density, conditional on $\lambda$, would be

$$p(n_1, \ldots, n_k \mid \lambda) = \prod_{i=1}^{k} p(n_i \mid \lambda) = \prod_{i=1}^{k} \frac{e^{-\lambda}\lambda^{n_i}}{n_i!} = \frac{e^{-k\lambda}\lambda^{\sum n_i}}{\prod n_i!}$$

But if $\lambda$ is unknown then we might treat it like a random variable. It would be a random variable if, for instance, different locations in the forest had different $\lambda$'s and we chose a location at random, or if we thought of Nature as randomly choosing $\lambda$ for our particular location. In that case the data from early years, $N_1, \ldots, N_m$, say, yield information about $\lambda$ and therefore about likely values of $N_{m+1}, \ldots, N_k$, so the $N_i$'s are dependent. In fact,

$$p(n_1, \ldots, n_k) = \int p(n_1, \ldots, n_k \mid \lambda)\, p(\lambda)\, d\lambda = \int \frac{e^{-k\lambda}\lambda^{\sum n_i}}{\prod n_i!}\, p(\lambda)\, d\lambda$$

So, whether the $N_i$'s are independent is not a question with a single right answer. Instead, it depends on our perspective. But in either case, we would say the $N_i$'s are conditionally independent given $\lambda$.

1.7 Simulation

We have already seen, in Example 1.2, an example of computer simulation to estimate a probability. More broadly, simulation can be helpful in several types of problems: calculating probabilities, assessing statistical procedures, and evaluating integrals. These are explained and exemplified in the next several subsections.

1.7.1 Calculating Probabilities

Probabilities can often be estimated by computer simulation. Simulations are especially useful for events so complicated that their probabilities cannot be easily calculated by hand, but composed of smaller events that are easily mimicked on computer. For instance, in Example 1.2 we wanted to know the probability that the shooter in a craps game rolls either 7 or 11 on the Come Out roll. Although it's easy enough to calculate this probability exactly, we did it by simulation in the Example.

Expected values can also be estimated by simulations. Let $Y$ be a random variable and suppose we want to estimate the expected value of some function, $\mu_g \equiv E(g(Y))$. We can write a computer program to simulate $Y$ many times. To keep track of the simulations we use the notation $y^{(i)}$ for the $i$'th simulated value of $Y$. Let $n$ be the number of simulations. Then

$$\hat\mu_g = n^{-1} \sum g(y^{(i)})$$

is a sensible estimate of $\mu_g$. In fact, the Law of Large Numbers tells us that

$$\lim_{n \to \infty} n^{-1} \sum g(y^{(i)}) = \mu_g.$$
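As a small illustration of this estimate (not taken from the original text), the R sketch below assumes $Y \sim N(0, 1)$ and $g(y) = y^2$, so that the true value is $\mu_g = \mathrm{Var}(Y) = 1$. The choices of distribution, function, seed and sample sizes are all arbitrary assumptions made for the example.

```r
set.seed(1)                             # arbitrary seed, for reproducibility only
g <- function(y) y^2                    # illustrative choice; E(g(Y)) = 1 here

n.values <- c(100, 10000, 1000000)
mu.hat <- sapply(n.values, function(n) {
  y <- rnorm(n)                         # n simulated values of Y ~ N(0, 1)
  mean(g(y))                            # n^{-1} times the sum of g(y^(i))
})

# Estimates of mu.g = 1 for increasing numbers of simulations
print(cbind(n = n.values, mu.hat = mu.hat))
```

Running the sketch with different seeds gives different estimates, but the estimates for the largest $n$ should vary far less than those for the smallest.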


So as we do a larger and larger simulation we get a more and more accurate estimate of $\mu_g$.

Probabilities can be computed as special cases of expectations. Suppose we want to calculate $P[Y \in S]$ for some set $S$. Define $X \equiv 1_S(Y)$. Then $P[Y \in S] = E(X)$ and is sensibly estimated by

$$\frac{\text{number of occurrences of } S}{\text{number of trials}} = n^{-1} \sum x^{(i)}.$$

Example 1.10 illustrates with the game of Craps.

Example 1.10 (Craps, continued)
Example 1.8 calculated the chance of winning the game of Craps. Here is the R code to calculate the same probability by simulation.

    # Roll until the point is made (TRUE) or a 7 is rolled (FALSE)
    makepoint <- function ( point ) {
      determined <- FALSE
      while ( !determined ) { # roll until outcome is determined
        roll <- sum ( sample ( 6, 2, replace=TRUE ) )
        if ( roll == point ) {
          made <- TRUE
          determined <- TRUE
        } else if ( roll == 7 ) {
          made <- FALSE
          determined <- TRUE
        }
      } # end while
      return ( made )
    } # end makepoint

    # Play one game of craps; return TRUE if the shooter wins
    sim.craps <- function () {
      roll <- sum ( sample ( 6, 2, replace=TRUE ) )
      if ( roll == 7 || roll == 11 )
        win <- TRUE
      else if ( roll == 2 || roll == 3 || roll == 12 )
        win <- FALSE
      else
        win <- makepoint ( roll )
      return ( win )
    }

    n.sim <- 1000
    wins <- 0
    for ( i in 1:n.sim )
      wins <- wins + sim.craps()
    print ( wins/n.sim )

• "!" is R's symbol for not. If determined is TRUE then !determined is FALSE.

• while ( !determined ) begins a loop. The loop will repeat as many times as necessary as long as !determined is TRUE.

Try the example code a few times. See whether you get about 49% as Example 1.8 suggests.

Along with the estimate itself, it is useful to estimate the accuracy of $\hat\mu_g$ as an estimate of $\mu_g$. If the simulations are independent then $\mathrm{Var}(\hat\mu_g) = n^{-1}\, \mathrm{Var}(g(Y))$; if there are many of them then $\mathrm{Var}(g(Y))$ can be well estimated by $n^{-1} \sum (g(y^{(i)}) - \hat\mu_g)^2$ and $\mathrm{SD}(g(Y))$ can be well estimated by $n^{-1/2} \sqrt{\sum (g(y^{(i)}) - \hat\mu_g)^2}$. Because the SD of $\hat\mu_g$ decreases in proportion to $n^{-1/2}$ (see the Central Limit Theorem), it takes a 100 fold increase in $n$ to get, for example, a 10 fold increase in accuracy.

Similar reasoning applies to probabilities, but when we are simulating the occurrence or nonoccurrence of an event, then the simulations are Bernoulli trials, so we have a more explicit formula for the variance and SD.

Example 1.11 (Craps, continued)
How accurate is the simulation in Example 1.10?

The simulation keeps track of $X$, the number of successes in n.sim trials. Let $\theta$ be the true probability of success. (Example 1.8 found $\theta \approx .49$, but in most practical applications we won't know $\theta$.)

$$\begin{aligned} X &\sim \mathrm{Bin}(\text{n.sim}, \theta) \\ \mathrm{Var}(X) &= \text{n.sim}\; \theta(1 - \theta) \\ \mathrm{SD}(X / \text{n.sim}) &= \sqrt{\theta(1 - \theta) / \text{n.sim}} \end{aligned}$$

and, by the Central Limit Theorem if n.sim is large,

$$\hat\theta = X / \text{n.sim} \sim N\big( \theta,\ (\theta(1 - \theta) / \text{n.sim})^{1/2} \big)$$
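To get a feel for the size of this SD, the formula can be evaluated directly. The sketch below is an added illustration rather than part of the original example; it plugs in $\theta = .493$ from Example 1.8 (in practice $\theta$ would be unknown and replaced by an estimate), and the name `sd.theta.hat` is introduced here for convenience.

```r
theta <- 0.493                          # value from Example 1.8; unknown in practice
n.sim <- c(50, 200, 1000)

# SD of theta.hat = X / n.sim under the Binomial model
sd.theta.hat <- sqrt(theta * (1 - theta) / n.sim)
print(round(sd.theta.hat, 3))           # 0.071 0.035 0.016
```

Going from 50 to 200 trials quadruples n.sim and halves the SD, as the $n^{-1/2}$ rate predicts.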


What does this mean in practical terms? How accurate is the simulation when n.sim = 50, or 200, or 1000, say? To illustrate we did 1000 simulations with n.sim = 50, then another 1000 with n.sim = 200, and then another 1000 with n.sim = 1000.

The results are shown as a boxplot in Figure 1.19. In Figure 1.19 there are three boxes, each with whiskers extending vertically. The box for n.sim = 50 shows that the median of the 1000 $\hat\theta$'s was just about .50 (the horizontal line through the box), that 50% of the $\hat\theta$'s fell between about .45 and .55 (the upper and lower ends of the box), and that almost all of the $\hat\theta$'s fell between about .30 and .68 (the extent of the whiskers). In comparison, the 1000 $\hat\theta$'s for n.sim = 200 are spread out about half as much, and the 1000 $\hat\theta$'s for n.sim = 1000 are spread out about half as much again. The factor of about a half comes from the $\text{n.sim}^{.5}$ in the formula for $\mathrm{SD}(\hat\theta)$. When n.sim increases by a factor of about 4, the SD decreases by a factor of about 2. See the notes for Figure 1.19 for a further description of boxplots.

[Figure 1.19: 1000 simulations of $\hat\theta$ for n.sim = 50, 200, 1000]

Here is the R code for the simulations and Figure 1.19.

    N <- 1000
    n.sim <- c(50, 200, 1000)
    theta.hat <- matrix ( NA, N, length(n.sim) )
    for ( i in seq(along=n.sim) ) {
      for ( j in 1:N ) {
        wins <- 0
        for ( k in 1:n.sim[i] )
          wins <- wins + sim.craps()
        theta.hat[j,i] <- wins / n.sim[i]
      }
    }
    boxplot ( theta.hat ~ col(theta.hat), names=n.sim,
              xlab="n.sim" )

• matrix forms a matrix. The form is matrix(x, nrows, ncols) where x are the entries in the matrix and nrows and ncols are the numbers of rows and columns.

• seq(along=n.sim) is the same as 1:length(n.sim) except that it behaves more sensibly in case length(n.sim) is 0.

• A boxplot is one way to display a data set. It produces a rectangle, or box, with a line through the middle. The rectangle contains the central 50% of the data. The line indicates the median of the data. Extending vertically above and below the box are dashed lines called whiskers. The whiskers contain most of the outer 50% of the data. A few extreme data points are plotted singly. See Example 2.3 for another use of boxplots and a fuller explanation.

1.7.2 Evaluating Statistical Procedures

Simulation can sometimes be useful in deciding whether a particular experiment is worthwhile or in choosing among several possible experiments or statistical procedures. For a fictitious example, consider ABC College, where, of the 10,000 students, 30% are members of a sorority or fraternity (greeks) and 70% are independents. There is an upcoming election for Head of the student governance organization. Two candidates are running, D and E. Let

$\theta_G$ = proportion of greeks supporting D
$\theta_I$ = proportion of independents supporting D
$\theta$ = proportion of all students supporting D

A poll is commissioned to estimate $\theta$ and it is agreed that the pollster will sample 100 students. Three different procedures are proposed.

1. Randomly sample 100 students. Estimate

   $\hat\theta_1$ = proportion of polled students who favor D

2. Randomly sample 100 students. Estimate

   $\hat\theta_G$ = proportion of polled greeks supporting D
   $\hat\theta_I$ = proportion of polled independents supporting D
   $\hat\theta_2 = .3\hat\theta_G + .7\hat\theta_I$

3. Randomly sample 30 greeks and 70 independents. Estimate

   $\hat\theta_G$ = proportion of polled greeks supporting D
   $\hat\theta_I$ = proportion of polled independents supporting D
   $\hat\theta_3 = .3\hat\theta_G + .7\hat\theta_I$

Which procedure is best? One way to answer the question is by exact calculation, but another way is by simulation. In the simulation we try each procedure many times to see how accurate it is, on average. We must choose some "true" values of $\theta_G$, $\theta_I$ and $\theta$ under which to do the simulation. Here is some R code for the simulation. The three functions sim.1, sim.2 and sim.3 are defined first so that they exist when the loop calls them.

    # choose "true" theta.g and theta.i
    theta.g <- .8
    theta.i <- .4
    prop.g <- .3
    prop.i <- 1 - prop.g
    theta <- prop.g * theta.g + prop.i * theta.i

    sampsize <- 100
    n.times <- 1000 # should be enough

    # procedure 1: simple random sample, overall proportion
    sim.1 <- function() {
      x <- rbinom ( 1, sampsize, theta )
      return ( x / sampsize )
    }

    # procedure 2: simple random sample, reweighted by group
    sim.2 <- function() {
      n.g <- rbinom ( 1, sampsize, prop.g )
      n.i <- sampsize - n.g
      x.g <- rbinom ( 1, n.g, theta.g )
      x.i <- rbinom ( 1, n.i, theta.i )
      t.hat.g <- x.g / n.g
      t.hat.i <- x.i / n.i
      return ( prop.g * t.hat.g + (1-prop.g) * t.hat.i )
    }

    # procedure 3: fixed sample sizes within each group
    sim.3 <- function() {
      n.g <- sampsize * prop.g
      n.i <- sampsize * prop.i
      x.g <- rbinom ( 1, n.g, theta.g )
      x.i <- rbinom ( 1, n.i, theta.i )
      t.hat.g <- x.g / n.g
      t.hat.i <- x.i / n.i
      return ( prop.g * t.hat.g + (1-prop.g) * t.hat.i )
    }

    theta.hat <- matrix ( NA, n.times, 3 )
    for ( i in 1:n.times ) {
      theta.hat[i,1] <- sim.1()
      theta.hat[i,2] <- sim.2()
      theta.hat[i,3] <- sim.3()
    }

    print ( apply ( theta.hat, 2, mean ) )
    boxplot ( theta.hat ~ col(theta.hat) )


• apply applies a function to a matrix. In the code above, apply(...) applies the function mean to dimension 2 of the matrix theta.hat. That is, it returns the mean of each column of theta.hat.

The boxplot, shown in Figure 1.20, shows little practical difference between the three procedures.

[Figure 1.20: 1000 simulations of $\hat\theta$ under three possible procedures for conducting a poll]

The next example shows how simulation was used to evaluate whether an experiment was worth carrying out.

Example 1.12 (FACE)
The amount of carbon dioxide, or CO$_2$, in the Earth's atmosphere has been steadily increasing over the last century or so. You can see the increase yourself in the co2 data set that comes with R. Typing ts.plot(co2) makes a time series plot, reproduced here as Figure 1.21. Typing help(co2) gives a brief explanation. The data are the concentrations of CO$_2$ in the atmosphere measured at Mauna Loa each month from 1959 to 1997. The plot shows a steadily increasing trend imposed on a regular annual cycle. The primary reason for the increase is burning of fossil fuels. CO$_2$ is a greenhouse gas that traps heat in the atmosphere instead of letting it radiate out, so an increase in atmospheric CO$_2$ will eventually result in an increase in the Earth's temperature. But what is harder to predict is the effect on the Earth's plants. Carbon is a nutrient needed by plants. It is possible that an increase in CO$_2$ will cause an increase in plant growth which in turn will partly absorb some of the extra carbon.

To learn about plant growth under elevated CO$_2$, ecologists began by conducting experiments in greenhouses. In a greenhouse, two sets of plants could be grown under conditions that are identical except for the amount of CO$_2$ in the atmosphere. But the controlled environment of a greenhouse is quite unlike the uncontrolled natural environment, so, to gain verisimilitude, experiments soon moved to open-top chambers. An open-top chamber is a space, typically a few meters in diameter, enclosed by a solid, usually plastic, wall and open at the top. CO$_2$ can be added to the air inside the chamber. Because the chamber is mostly enclosed, not much CO$_2$ will escape, and more can be added as needed. Some plants can be grown in chambers with excess CO$_2$, others in chambers without, and their growth compared. But as with greenhouses, open-top chambers are not completely satisfactory. Ecologists wanted to conduct experiments under even more natural conditions.

To that end, in the late 1980's the Office of Biological and Environmental Research in the U.S. Department of Energy (DOE) began supporting research using a technology called FACE, or Free Air CO$_2$ Enrichment, developed at the Brookhaven National Laboratory. As the lab's webpage www.face.bnl.gov/face1.htm explains

    "FACE provides a technology by which the microclimate around growing
    plants may be modified to simulate climate change conditions. Typically
    CO2-enriched air is released from a circle of vertical pipes into plots up to
    30m in diameter, and as tall as 20 m.

    "Fast feedback control and pre-dilution of CO2 provide stable, elevated
    [CO2] simulating climate change conditions.

    "No containment is required with FACE equipment and there is no significant
    change in natural air-flow. Large FACE plots reduce effects of plot edge
    and capture fully-functioning, integrated ecosystem-scale processes. FACE
    Field data represent plant and ecosystems responses to concentrations of
    atmospheric CO2 expected in the mid-twenty-first century."
See the website for pictures and more information. In a FACE experiment, CO$_2$ is released into some treatment plots. The level of CO$_2$ inside the plot is continually monitored. More CO$_2$ is released as needed to keep the amount of CO$_2$ in the atmosphere at some prespecified level, typically the level that is expected in the mid-21st century. Other plots are reserved as control plots. Plant growth in the treatment plots is compared to that in the control plots.


[Figure 1.21: Monthly concentrations of CO$_2$ at Mauna Loa]


Because a FACE site is not enclosed, CO2 continually drifts out of the site and needs to be replenished. Keeping enough CO2 in the air is very costly and is, in fact, the major expense in conducting FACE experiments.

The first several FACE sites were in Arizona (sorghum, wheat, cotton), Switzerland (rye grass, clover) and California (native chaparral). All of these contained low-growing plants. By the early 1990's, ecologists wanted to conduct a FACE experiment in a forest, and such an experiment was proposed by investigators at Duke University, to take place in Duke Forest. But before the experiment could be funded the investigators had to convince the Department of Energy (DOE) that it would be worthwhile. In particular, they wanted to demonstrate that the experiment would have a good chance of uncovering whatever growth differences would exist between treatment and control. The demonstration was carried out by computer simulation. The code for that demonstration, slightly edited for clarity, is given at the end of this Example and explained below.

1. The experiment would consist of 6 sites, divided into 3 pairs. One site in each pair would receive the CO2 treatment; the other would be a control. The experiment was planned to run for 10 years. Investigators had identified 16 potential sites in Duke Forest. The aboveground biomass of those sites, measured before the experiment began, is given in the line b.mass <- c(...).

2. The code simulates 1000 repetitions of the experiment. That's the meaning of nreps <- 1000.

3. The aboveground biomass of each site is stored in M.actual.control and M.actual.treatment. There must be room to store the biomass of each site for every combination of (pair, year, repetition). The array(...) command creates a multidimensional matrix, or array, filled with NA's. The dimensions are given by c(npairs, nyears+1, nreps).

4. A site's actual biomass is not known exactly but is measured with error. The simulated measurements are stored in M.observed.control and M.observed.treatment.

5. Each repetition begins by choosing 6 sites from among the 16 available. Their observed biomass goes into temp. The first three values are assigned to M.observed.control and the last three to M.observed.treatment. All this happens in a loop for (i in 1:nreps).

6. Investigators expected that control plots would grow at an average rate of 2% per year and treatment plots at an average of something else. Those values are called betaC and betaT. The simulation was run with betaT = 1.04, 1.06, 1.08 (shown below) and 1.10. Each site would have its own growth rate which would be slightly different from betaC or betaT. For control sites, those rates are drawn from the N(betaC, 0.1*(betaC-1)) distribution and stored in beta.control, and similarly for the treatment sites.

7. Measurement errors of biomass were expected to have an SD around 5%. That's sigmaE. But at each site in each year the measurement error would be slightly different. The measurement errors are drawn from the N(1, sigmaE) distribution and stored in errors.control and errors.treatment.

8. Next we simulate the actual biomass of the sites. For the first year, where we already have measurements, that's

       M.actual.control[,1,] <- ...
       M.actual.treatment[,1,] <- ...

   For subsequent years the biomass in year i is the biomass in year i-1 multiplied by the growth factor beta.control or beta.treatment. Biomass is simulated in the loop for (i in 2:(nyears+1)).

9. Measured biomass is the actual biomass multiplied by measurement error. It is simulated by

       M.observed.control <- ...
       M.observed.treatment <- ...

10. The simulations for each year were analyzed each year by a two-sample t-test which looks at the ratio

       (biomass in year i) / (biomass in year 1)

    to see whether it is significantly larger for treatment sites than for control sites. See Section xyz for details about t-tests. For our purposes here, we have replaced the t-test with a plot, Figure 1.22, which shows a clear separation between treatment and control sites after about 5 years.

The DOE did decide to fund the proposal for a FACE experiment in Duke Forest, at least partly because of the demonstration that such an experiment would have a reasonably large chance of success.
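The ten steps above can be sketched as one short, runnable script. This is a minimal sketch, not the investigators' exact code: the first-year step, the growth loop and the final summary are assumptions filled in from the descriptions in steps 8 to 10, and set.seed together with the helper names dims, first.control and first.treatment are introduced here only for illustration.

```r
# Sketch of the FACE power simulation, assembled from steps 1-10.
# Lines marked (*) are assumptions reconstructed from the prose.
set.seed(1)                                   # (*) for reproducibility

b.mass <- c(17299.1, 17793.1, 23211.7, 23351.8, 24278,
            25335.9, 27001.5, 27113.6, 30184.3, 30625.5,
            33496.2, 33733.76, 35974.3, 38490.8, 40319.6, 44903)
npairs <- 3; nyears <- 10; nreps <- 1000
dims <- c(npairs, nyears + 1, nreps)          # (*) helper name

M.actual.control   <- array(NA, dims)
M.actual.treatment <- array(NA, dims)

# Step 5: choose 6 of the 16 sites for each repetition
first.control   <- matrix(NA, npairs, nreps)  # (*) year-1 measurements
first.treatment <- matrix(NA, npairs, nreps)
for (i in 1:nreps) {
  temp <- sample(b.mass, size = 2 * npairs)
  first.control[, i]   <- temp[1:npairs]
  first.treatment[, i] <- temp[(npairs + 1):(2 * npairs)]
}

# Step 6: site-specific growth rates
betaC <- 1.02
betaT <- 1.08
beta.control   <- matrix(rnorm(npairs * nreps, betaC, 0.1 * (betaC - 1)),
                         npairs, nreps)
beta.treatment <- matrix(rnorm(npairs * nreps, betaT, 0.1 * (betaT - 1)),
                         npairs, nreps)

# Step 7: multiplicative measurement errors
sigmaE <- 0.05
errors.control   <- array(rnorm(prod(dims), 1, sigmaE), dims)
errors.treatment <- array(rnorm(prod(dims), 1, sigmaE), dims)

# Step 8 (*): actual biomass; year 1 is the measurement with its error removed
M.actual.control[, 1, ]   <- first.control   / errors.control[, 1, ]
M.actual.treatment[, 1, ] <- first.treatment / errors.treatment[, 1, ]
for (j in 2:(nyears + 1)) {
  M.actual.control[, j, ]   <- M.actual.control[, j - 1, ]   * beta.control
  M.actual.treatment[, j, ] <- M.actual.treatment[, j - 1, ] * beta.treatment
}

# Step 9: observed biomass = actual biomass * measurement error
M.observed.control   <- M.actual.control   * errors.control
M.observed.treatment <- M.actual.treatment * errors.treatment

# Step 10 (*): mean ratio (year j)/(year 1) over all sites and repetitions
ratio.control <- sapply(2:(nyears + 1), function(j)
  mean(M.observed.control[, j, ] / M.observed.control[, 1, ]))
ratio.treatment <- sapply(2:(nyears + 1), function(j)
  mean(M.observed.treatment[, j, ] / M.observed.treatment[, 1, ]))

round(rbind(ratio.control, ratio.treatment), 2)
```

With betaT = 1.08 the two rows of mean ratios should pull apart steadily over the ten years, which is the pattern Figure 1.22 is described as showing. Rerunning with betaT set to 1.04, 1.06 or 1.10 gives the other scenarios mentioned in step 6.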

Figure 1.22: 1000 simulations of a FACE experiment. The x-axis is years. The y-axis shows the mean growth rate (biomass in year i / biomass in year 1) of control plants (lower solid line) and treatment plants (upper solid line). Standard deviations are shown as dashed lines.

    ########################################################
    # A power analysis of the FACE experiment
    #
    # Initial measured biomass of potential FACE sites in g/m2:
    b.mass <- c ( 17299.1, 17793.1, 23211.7, 23351.8, 24278,
                  25335.9, 27001.5, 27113.6, 30184.3, 30625.5,
                  33496.2, 33733.76, 35974.3, 38490.8, 40319.6,
                  44903 )

    npairs <- 3
    nyears <- 10
    nreps <- 1000

    M.observed.control   <- array ( NA, c(npairs,nyears+1,nreps) )
    M.actual.control     <- array ( NA, c(npairs,nyears+1,nreps) )
    M.observed.treatment <- array ( NA, c(npairs,nyears+1,nreps) )
    M.actual.treatment   <- array ( NA, c(npairs,nyears+1,nreps) )

    # Specify the initial levels of biomass
    for ( i in 1:nreps ) {
      temp <- sample ( b.mass, size=2*npairs )
      M.observed.control  [,1,i] <- temp [1:npairs]
      M.observed.treatment[,1,i] <- temp [(npairs+1):(2*npairs)]
    }

    # Specify the betas
    betaC <- 1.02
    betaT <- 1.08

    beta.control   <- matrix ( rnorm ( npairs*nreps, betaC,
                                       0.1*(betaC-1) ),
                               npairs, nreps )
    beta.treatment <- matrix ( rnorm ( npairs*nreps, betaT,
                                       0.1*(betaT-1) ),
                               npairs, nreps )

    #############################################################
    # measurement errors in biomass
    sigmaE <- 0.05
    errors.control   <- array ( rnorm ( npairs*(nyears+1)*nreps,
                                        1, sigmaE ),
                                c(npairs,nyears+1,nreps) )
    errors.treatment <- array ( rnorm ( npairs*(nyears+1)*nreps,
                                        1, sigmaE ),
                                c(npairs,nyears+1,nreps) )

    ##############################################################
    ##############################################################
    # Generate 10 years of data. The model for generation is:
    # M.actual[i,j,] : above ground biomass in ring i, year j
    # M.actual[i,j+1,] = beta[i] * M.actual[i,j,]
    # We actually observe M.observed[i,j] = M.actual[i,j] * error

    # Start with M.observed[i,1] and generate M.actual[i,1]
    M.actual.control[,1,]

    M.observed.treatment <- M.actual.treatment * errors.treatment

    ##############################################################
    ##############################################################
    # two-sample t-test on M.observed[j]/M.observed[1] removed
    # plot added

    ratio.control   <- matrix ( NA, nyears, npairs*nreps )
    ratio.treatment <- matrix ( NA, nyears, npairs*nreps )

    for ( i in 2:(nyears+1) ) {
      ratio.control[i-1,]

own computer and try out the analysis in R to develop your familiarity with what will prove to be a very useful tool. The data can be found at StatLib, an on-line repository of statistical data and software. The data were originally contributed by Roger Johnson of the Department of Mathematics and Computer Science at the South Dakota School of Mines and Technology. The StatLib website is lib.stat.cmu.edu. If you go to StatLib and follow the links to datasets and then bodyfat you will find a file containing both the data and an explanation. Copy just the data to a text file named bodyfat.dat on your own computer. The file should contain just the data; the first few lines should look like this:

    1.0708 12.3 23 ...
    1.0853  6.1 22 ...
    1.0414 25.3 22 ...

The following snippet shows how to read the data into R and save it into bodyfat.

    bodyfat <- read.table ( "bodyfat.dat",
      col.names = c ( "density", "percent.fat", "age", "weight",
        "height", "neck.circum", "chest.circum", "abdomen.circum",
        "hip.circum", "thigh.circum", "knee.circum", "ankle.circum",
        "bicep.circum", "forearm.circum", "wrist.circum" ) )
    dim ( bodyfat )     # how many rows and columns in the data set?
    names ( bodyfat )   # names of the columns

read.table(...) reads data from a text file into a dataframe. A dataframe is a flexible way to represent data because it can be treated as either a matrix or a list. Type help(read.table) to learn more. The first argument, "bodyfat.dat", tells R what file to read. The second argument, col.names = c("density", ...), tells R the names of the columns.

dim gives the dimension of a matrix, a dataframe, or anything else that has a dimension. For a matrix or dataframe, dim tells how many rows and columns.

names gives the names of things. names(bodyfat) should tell us the names density, percent.fat, .... It's used here to check that the data were read the way we intended.

Individual elements of matrices can be accessed by two-dimensional subscripts such as bodyfat[1,1] or bodyfat[3,7] in which the subscripts refer to the row and column of the matrix. (Try this out to make sure you know how two dimensional subscripts work.) If the columns of the matrix have names, then the second

subscript can be a name, as in bodyfat[1,"density"] or bodyfat[3,"chest.circum"]. Often we need to refer to an entire column at once, which can be done by omitting the first subscript. For example, bodyfat[,2] refers to the entire set of 252 measurements of percent body fat.

A dataframe is a list of columns. Because bodyfat has 15 columns its length, length(bodyfat), is 15. Members of a list can be accessed by subscripts with double brackets, as in bodyfat[[1]]. Each member of bodyfat is a vector of length 252. Individual measurements can be accessed as in bodyfat[[1]][1] or bodyfat[[3]][7]. If the list members have names, then they can be accessed as in bodyfat$percent.fat. Note the quotation marks used when treating bodyfat as a matrix and the lack of quotation marks when treating bodyfat as a list. The name following the "$" can be abbreviated, as long as the abbreviation is unambiguous. Thus bodyfat$ab works, but bodyfat$a fails to distinguish among age, abdomen.circum and ankle.circum.

Begin by displaying the data.

    par ( mfrow=c(5,3) )   # establish a 5 by 3 array of plots
    for ( i in 1:15 ) {
      hist ( bodyfat[[i]], xlab="", main=names(bodyfat)[i] )
    }

Although it's not our immediate purpose, it's interesting to see what the relationships are among the variables. Try pairs(bodyfat).

To illustrate some of R's capabilities and to explore the concepts of marginal, joint and conditional densities, we'll look more closely at percent fat and its relation to abdomen circumference. Begin with a histogram of percent fat.

    fat <- bodyfat$per     # give these two variables short names
    abd <- bodyfat$abd     # so we can refer to them easily
    par ( mfrow=c(1,1) )   # just need one plot now, not 15
    hist ( fat )

We'd like to rescale the vertical axis to make the area under the histogram equal to 1, as for a density. R will do that by drawing the histogram on a "density" scale instead of a "frequency" scale. While we're at it, we'll also make the labels prettier. We also want to draw a Normal curve approximation to the histogram, so we'll need the mean and standard deviation.

    hist ( fat, xlab="", main="percent fat", freq=F )

    mu <- mean ( fat )
    sigma <- sqrt ( var ( fat ) )   # standard deviation
    lo <- mu - 3*sigma
    hi <- mu + 3*sigma
    x <- seq ( lo, hi, length=50 )
    lines ( x, dnorm ( x, mu, sigma ) )

That looks better, but we can do better still by slightly enlarging the axes. Redraw the picture, but use

    hist ( fat, xlab="", main="percent fat", freq=F,
           xlim=c(-10, 60), ylim=c(0,.06) )

The Normal curve fits the data reasonably well. A good summary of the data is that it is distributed approximately N(19.15, 8.37).

Now examine the relationship between abdomen circumference and percent body fat. Try the following command.

    plot ( abd, fat, xlab="abdomen circumference",
           ylab="percent body fat" )

The scatter diagram shows a clear relationship between abdomen circumference and body fat in this group of men. One man doesn't fit the general pattern; he has a circumference around 148 but a body fat only around 35%, relatively low for such a large circumference. To quantify the relationship between the variables, let's divide the men into groups according to circumference and estimate the conditional distribution of fat given circumference. If we divide the men into twelfths we'll have 21 men per group.

    cut.pts <- quantile ( abd, (0:12)/12 )
    groups <- cut ( abd, cut.pts, include.lowest=T, labels=1:12 )
    boxplot ( fat ~ groups,
      xlab="quantiles of abdomen circumference",
      ylab="percent body fat" )

Note:

A quantile is a generalization of median. For example, the 1/12-th quantile of abd is the number q such that 1/12 of all the abd measurements are less than q and 11/12 are greater than q. (A more careful definition would say what to do in case of ties.) The median is the .5 quantile. We have cut our data according to the 1/12, 2/12, ..., 12/12 quantiles of abd.
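For readers without bodyfat.dat at hand, the quantile and cut steps can be tried on simulated numbers. The following is only a sketch: abd.sim is a hypothetical stand-in for abd, and its mean and SD are made up for illustration rather than taken from the data.

```r
# Cutting 252 simulated measurements into twelfths.
# abd.sim is a made-up stand-in for abd; 92 and 10 are arbitrary.
set.seed(2)
abd.sim <- rnorm(252, mean = 92, sd = 10)

cut.pts.sim <- quantile(abd.sim, (0:12) / 12)   # 13 cut points: 0/12, ..., 12/12
groups.sim  <- cut(abd.sim, cut.pts.sim, include.lowest = TRUE, labels = 1:12)

table(groups.sim)               # number of observations in each twelfth
quantile(abd.sim, 0.5)          # the .5 quantile ...
median(abd.sim)                 # ... is the median
```

Printing cut.pts.sim and groups.sim side by side with abd.sim is a quick way to see which interval each measurement falls in.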

If you don't see what the cut(abd,...) command does, print out cut.pts and groups, then look at them until you figure it out.

Boxplots are a convenient way to compare different groups of data. In this case there are 12 groups. Each group is represented on the plot by a box with whiskers. The box spans the first and third quartiles (.25 and .75 quantiles) of fat for that group. The line through the middle of the box is the median fat for that group. The whiskers extend to cover most of the rest of the data. A few outlying fat values fall outside the whiskers; they are indicated as individual points.

"fat ~ groups" is R's notation for a formula. It means to treat fat as a function of groups. Formulas are extremely useful and will arise repeatedly.

The medians increase in not quite a regular pattern. The irregularities are probably due to the vagaries of sampling. We can find the mean, median and standard deviation of fat for each group with

    mu.fat <- tapply ( fat, groups, mean )
    me.fat <- tapply ( fat, groups, median )
    sd.fat <- sqrt ( tapply ( fat, groups, var ) )
    cbind ( mu.fat, me.fat, sd.fat )

tapply means "apply to every element of a table." In this case, the table is fat, grouped according to groups.

cbind means "bind together in columns". There is an analogous command rbind.

Finally, let's make a figure similar to Figure 1.12.

    x <- seq ( 0, 50, by=1 )
    par ( mfrow=c(4,3) )
    for ( i in 1:12 ) {
      good <- groups == i
      hist ( fat[good], xlim=c(0,50), ylim=c(0,.1),
             breaks=seq(0,50,by=5), freq=F,
             xlab="percent fat", main="" )
      y <- dnorm ( x, mu.fat[i], sd.fat[i] )
      lines ( x, y )
    }
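To see what tapply and cbind return, it may help to run them on a data set small enough to check by hand. The values below are invented for illustration; fat.toy and groups.toy are hypothetical names, not part of the bodyfat data.

```r
# tapply and cbind on a tiny made-up data set
fat.toy    <- c(10, 12, 14, 20, 22, 30)      # invented measurements
groups.toy <- factor(c(1, 1, 1, 2, 2, 3))    # group label for each measurement

mu.toy <- tapply(fat.toy, groups.toy, mean)    # one mean per group
me.toy <- tapply(fat.toy, groups.toy, median)  # one median per group
n.toy  <- tapply(fat.toy, groups.toy, length)  # group sizes

summary.toy <- cbind(mu.toy, me.toy, n.toy)    # one row per group, one column per summary
summary.toy
```

Each call to tapply returns a vector with one entry per level of groups.toy, and cbind lines those vectors up as the columns of a matrix.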

The Normal curves seem to fit well. We saw earlier that the marginal distribution of percent body fat (marginal means unconditional) is well approximated by N(19.15, 8.37). Here we see that the conditional distribution of percent body fat, given that abdomen circumference is in between the $(i-1)/12$ and $i/12$ quantiles, is N(mu.fat[i], sd.fat[i]). If we know a man's abdomen circumference even approximately then (1) we can estimate his percent body fat more accurately and (2) the typical estimation error is smaller. [add something about estimation error in the sd section]

1.9 Some Results for Large Samples

It is intuitively obvious that large samples are better than small, that more data is better than less, and, less obviously, that with enough data one is eventually led to the right answer. These intuitive observations have precise mathematical statements in the form of Theorems 1.12, 1.13 and 1.14. We state those theorems here so we can use them throughout the rest of the book. They are examined in more detail in Section 8.9.

Definition 1.9 (Random Sample). A collection $y_1, \dots, y_n$ of random variables is said to be a random sample of size $n$ from pdf or pmf $f$ if and only if

1. $y_i \sim f$ for each $i = 1, 2, \dots, n$ and
2. the $y_i$'s are mutually independent, i.e. $f(y_1, \dots, y_n) = \prod_{i=1}^n f(y_i)$.

The collection $(y_1, \dots, y_n)$ is called a data set. We write $y_1, \dots, y_n \sim \text{i.i.d. } f$, where i.i.d. stands for independent and identically distributed.

Many introductory statistics texts describe how to collect random samples, many pitfalls that await, and many pratfalls taken in the attempt. We omit that discussion here and refer the interested reader to our favorite introductory text on the subject, Freedman et al. [1998], which has an excellent description of random sampling in general as well as detailed discussion of the US census and the Current Population Survey.

Suppose $y_1, y_2, \dots \sim \text{i.i.d. } f$. Let $\mu = \int y f(y)\,dy$ and $\sigma^2 = \int (y-\mu)^2 f(y)\,dy$ be the mean and variance of $f$. Typically $\mu$ and $\sigma$ are unknown and we take the sample in order to learn about them. We will often use $\bar{y}_n \equiv (y_1 + \cdots + y_n)/n$, the mean of the first $n$ observations, to estimate $\mu$. Some questions to consider are

    For a sample of size $n$, how accurate is $\bar{y}_n$ as an estimate of $\mu$?
    Does $\bar{y}_n$ get closer to $\mu$ as $n$ increases?
    How large must $n$ be in order to achieve a desired level of accuracy?

Theorems 1.12 and 1.14 provide answers to these questions. Before stating them we need some preliminary results about the mean and variance of $\bar{y}_n$.

Theorem 1.7. Let $x_1, \dots, x_n$ be random variables with means $\mu_1, \dots, \mu_n$. Then $E[x_1 + \cdots + x_n] = \mu_1 + \cdots + \mu_n$.

Proof. It suffices to prove the case $n = 2$.
\begin{align*}
E[x_1 + x_2] &= \iint (x_1 + x_2)\, f(x_1, x_2)\, dx_1\, dx_2 \\
&= \iint x_1 f(x_1, x_2)\, dx_1\, dx_2 + \iint x_2 f(x_1, x_2)\, dx_1\, dx_2 \\
&= \mu_1 + \mu_2
\end{align*}

Corollary 1.8. Let $y_1, \dots, y_n$ be a random sample from $f$ with mean $\mu$. Then $E[\bar{y}_n] = \mu$.

Proof. The corollary follows from Theorems 1.3 and 1.7.

Theorem 1.9. Let $x_1, \dots, x_n$ be independent random variables with means $\mu_1, \dots, \mu_n$ and SDs $\sigma_1, \dots, \sigma_n$. Then $\mathrm{Var}[x_1 + \cdots + x_n] = \sigma_1^2 + \cdots + \sigma_n^2$.

Proof. It suffices to prove the case $n = 2$. Using Theorem 1.2,
\begin{align*}
\mathrm{Var}(X_1 + X_2) &= E\big((X_1 + X_2)^2\big) - (\mu_1 + \mu_2)^2 \\
&= E(X_1^2) + 2E(X_1 X_2) + E(X_2^2) - \mu_1^2 - 2\mu_1\mu_2 - \mu_2^2 \\
&= \big(E(X_1^2) - \mu_1^2\big) + \big(E(X_2^2) - \mu_2^2\big) + 2\big(E(X_1 X_2) - \mu_1\mu_2\big) \\
&= \sigma_1^2 + \sigma_2^2 + 2\big(E(X_1 X_2) - \mu_1\mu_2\big).
\end{align*}
But if $X_1 \perp X_2$ then
\begin{align*}
E(X_1 X_2) &= \iint x_1 x_2 f(x_1, x_2)\, dx_1\, dx_2 \\
&= \int x_1 \left( \int x_2 f(x_2)\, dx_2 \right) f(x_1)\, dx_1 \\
&= \mu_2 \int x_1 f(x_1)\, dx_1 = \mu_1 \mu_2.
\end{align*}
So $\mathrm{Var}(X_1 + X_2) = \sigma_1^2 + \sigma_2^2$.
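Theorems 1.7 and 1.9 can be checked numerically. The sketch below is an illustration, not a proof: the two distributions, the sample size and the variable names are arbitrary choices made here. It compares the mean and variance of a sum of two independent variables with the sum of their means and variances, then repeats the calculation for two perfectly dependent variables.

```r
# Simulation check of E[x1 + x2] and Var[x1 + x2]
set.seed(3)
n  <- 100000
x1 <- rexp(n, rate = 2)             # mean 0.5, variance 0.25
x2 <- runif(n, min = 0, max = 6)    # mean 3,   variance 3

s.indep <- x1 + x2                  # x1 and x2 drawn independently
c(mean = mean(s.indep), var = var(s.indep))   # compare with 3.5 and 3.25

s.dep <- x1 + x1                    # two copies of the same variable: dependent
c(mean = mean(s.dep), var = var(s.dep))       # compare with 1.0 and 0.25 + 0.25
```

In the dependent case the means still add, but the variance of the sum is about 1 rather than 0.5: the mean result holds without independence, while the variance result relies on it.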

Note that Theorem 1.9 requires independence while Theorem 1.7 does not.

Corollary 1.10. Let $y_1, \dots, y_n$ be a random sample from $f$ with variance $\sigma^2$. Then $\mathrm{Var}(\bar{y}_n) = \sigma^2 / n$.

Proof. The corollary follows from Theorems 1.4 and 1.9.

Theorem 1.11 (Chebychev's Inequality). Let $X$ be a random variable with mean $\mu$ and SD $\sigma$. Then for any $\epsilon > 0$,
\[
P[\,|X - \mu| \ge \epsilon\,] \le \sigma^2 / \epsilon^2 .
\]

Proof.
\begin{align*}
\sigma^2 &= \int (x - \mu)^2 f(x)\, dx \\
&= \int_{-\infty}^{\mu - \epsilon} (x - \mu)^2 f(x)\, dx + \int_{\mu - \epsilon}^{\mu + \epsilon} (x - \mu)^2 f(x)\, dx + \int_{\mu + \epsilon}^{\infty} (x - \mu)^2 f(x)\, dx \\
&\ge \int_{-\infty}^{\mu - \epsilon} (x - \mu)^2 f(x)\, dx + \int_{\mu + \epsilon}^{\infty} (x - \mu)^2 f(x)\, dx \\
&\ge \epsilon^2 \int_{-\infty}^{\mu - \epsilon} f(x)\, dx + \epsilon^2 \int_{\mu + \epsilon}^{\infty} f(x)\, dx \\
&= \epsilon^2\, P[\,|X - \mu| \ge \epsilon\,].
\end{align*}

Theorems 1.12 and 1.14 are the two main limit theorems of statistics. They provide answers, at least probabilistically, to the questions posed earlier in this section.

Theorem 1.12 (Weak Law of Large Numbers). Let $y_1, \dots, y_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then for any $\epsilon > 0$,
\[
\lim_{n \to \infty} P[\,|\bar{y}_n - \mu| < \epsilon\,] = 1. \tag{1.18}
\]

Proof. Apply Chebychev's Inequality (Theorem 1.11) to $\bar{y}_n$.
\[
\lim_{n \to \infty} P[\,|\bar{y}_n - \mu| < \epsilon\,]
= \lim_{n \to \infty} \big(1 - P[\,|\bar{y}_n - \mu| \ge \epsilon\,]\big)
\ge \lim_{n \to \infty} \big(1 - \sigma^2 / (n \epsilon^2)\big) = 1.
\]

Another version of Theorem 1.12 is called the Strong Law of Large Numbers.
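Chebychev's Inequality and the Weak Law can both be watched at work in a small simulation. This is a sketch under assumptions chosen here for convenience: the Exp(1) distribution (so that $\mu = \sigma = 1$), the tolerances, the sample sizes and all variable names are illustrative and do not come from the text.

```r
# Chebychev's Inequality and the Weak Law, illustrated with Exp(1) draws
set.seed(4)

# Chebychev: P[|X - mu| >= eps] <= sigma^2 / eps^2, with mu = sigma = 1
x      <- rexp(100000, rate = 1)
eps    <- 2
p.tail <- mean(abs(x - 1) >= eps)     # simulated tail probability
bound  <- 1 / eps^2                   # Chebychev bound
c(p.tail = p.tail, bound = bound)

# Weak Law: P[|ybar_n - mu| < eps.w] should approach 1 as n grows
reps     <- 2000
eps.w    <- 0.1
n.values <- c(10, 100, 1000)
p.close <- sapply(n.values, function(n) {
  ybar <- rowMeans(matrix(rexp(reps * n, rate = 1), reps, n))  # reps sample means
  mean(abs(ybar - 1) < eps.w)
})
cheb.lower <- 1 - 1 / (n.values * eps.w^2)   # lower bound used in the proof
rbind(n.values, p.close, cheb.lower)
```

The simulated probabilities should sit above the Chebychev lower bound for every n and climb toward 1. The bound is crude (it is negative for n = 10), which is a reminder that Chebychev guarantees the limit without describing how fast it is reached.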

Theorem 1.13 (Strong Law of Large Numbers). Let $y_1, \dots, y_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then for any $\epsilon > 0$,
\[
P[\,\lim_{n \to \infty} |\bar{Y}_n - \mu| < \epsilon\,] = 1;
\]
i.e.,
\[
P[\,\lim_{n \to \infty} \bar{Y}_n = \mu\,] = 1.
\]

It is beyond the scope of this section to explain the difference between the WLLN and the SLLN. See Section 8.9.

Theorem 1.14 (Central Limit Theorem). Let $y_1, \dots, y_n$ be a random sample from $f$ with mean $\mu$ and variance $\sigma^2$. Let $z_n = (\bar{y}_n - \mu)/(\sigma/\sqrt{n})$. Then, for any numbers $a$

a histogram of 1000 simulations, all using n.sim = 50. For a single simulation with n.sim = 50 let $X_1, \dots, X_{50}$ be the outcomes of those simulations. Each $X_i \sim \text{Bern}(.493)$, so $\mu = .493$ and $\sigma = \sqrt{.493 \times .507} \approx .5$. Therefore, according to the Central Limit Theorem, when n.sim = 50
\[
\bar{X}_n \sim N(\mu, \sigma/\sqrt{n}) \approx N(.493, .071)
\]
This is the Normal density plotted in the upper panel of Figure 1.23. We see that the N(.493, .071) is a good approximation to the histogram. And that's because $\hat\theta = \bar{X}_{50} \sim N(.493, .071)$, approximately. The Central Limit Theorem says that the approximation will be good for "large" $n$. In this case $n = 50$ is large enough. (Section 8.1.4 will discuss the question of when $n$ is "large enough".)

Similarly,
\begin{align*}
\bar{X}_n &\sim N(.493, .035) \quad \text{when n.sim} = 200 \\
\bar{X}_n &\sim N(.493, .016) \quad \text{when n.sim} = 1000.
\end{align*}
These densities are plotted in the middle and lower panels of Figure 1.23.

The Central Limit Theorem makes three statements about the distribution of $\bar{y}_n$ ($z_n$) in large samples:

1. $E[\bar{y}_n] = \mu$ ($E[z_n] = 0$),
2. $SD(\bar{y}_n) = \sigma/\sqrt{n}$ ($SD(z_n) = 1$), and
3. $\bar{y}_n$ ($z_n$) has, approximately, a Normal distribution.

The first two of these are already known from Theorems 1.7 and 1.9. It's the third point that is key to the Central Limit Theorem. Another surprising implication from the Central Limit Theorem is that the distributions of $\bar{y}_n$ and $z_n$ in large samples are determined solely by $\mu$ and $\sigma$; no other features of $f$ matter.

1.10 Exercises

1. Show: if $\mu$ is a probability measure then for any integer $n \ge 2$, and disjoint sets $A_1, \dots, A_n$,
\[
\mu\Big(\bigcup_{i=1}^n A_i\Big) = \sum_{i=1}^n \mu(A_i).
\]

2. Simulating Dice Rolls

Figure 1.23: Histograms of craps simulations. Solid curves are Normal approximations according to the Central Limit Theorem.

   (a) Simulate 6000 dice rolls. Count the number of 1's, 2's, ..., 6's.
   (b) You expect about 1000 of each number. How close was your result to what you expected?
   (c) About how often would you expect to get more than 1030 1's? Run an R simulation to estimate the answer.

3. The Game of Risk. In the board game Risk players place their armies in different countries and try eventually to control the whole world by capturing countries one at a time from other players. To capture a country, a player must attack it from an adjacent country. If player A has $A \ge 2$ armies in country A, she may attack adjacent country D. Attacks are made with from 1 to 3 armies. Since at least 1 army must be left behind in the attacking country, A may choose to attack with a minimum of 1 and a maximum of $\min(3, A-1)$ armies. If player D has $D \ge 1$ armies in country D, he may defend himself against attack using a minimum of 1 and a maximum of $\min(2, D)$ armies. It is almost always best to attack and defend with the maximum permissible number of armies.

   When player A attacks with $a$ armies she rolls $a$ dice. When player D defends with $d$ armies he rolls $d$ dice. A's highest die is compared to D's highest. If both players use at least two dice, then A's second highest is also compared to D's second highest. For each comparison, if A's die is higher than D's then A wins and D removes one army from the board; otherwise D wins and A removes one army from the board. When there are two comparisons, a total of two armies are removed from the board.

   If A attacks with one army (she has two armies in country A, so may only attack with one) and D defends with one army (he has only one army in country D) what is the probability that A will win?

   Suppose that Player 1 has two armies each in countries $C_1$, $C_2$, $C_3$ and $C_4$, that Player 2 has one army each in countries $B_1$, $B_2$, $B_3$ and $B_4$, and that country $C_i$ attacks country $B_i$. What is the chance that Player 1 will be successful in at least one of the four attacks?

4. (a) Justify the last step of Equation 1.2.
   (b) Justify the last step of the proof of Theorem 1.1.
   (c) Prove Theorem 1.1 when $g$ is a decreasing function.

5. $Y$ is a random variable. $Y \in (-1, 1)$. The pdf is $p(y) = k y^2$ for some constant, $k$.
   (a) Find $k$.
   (b) Use R to plot the pdf.
   (c) Let $Z = -Y$. Find the pdf of $Z$. Plot it.

6. $U$ is a random variable on the interval $[0, 1]$; $p(u) = 1$.
   (a) $V = U^2$. On what interval does $V$ live? Plot $V$ as a function of $U$. Find the pdf of $V$. Plot $p_V(v)$ as a function of $v$.
   (b) $W = 2U$. On what interval does $W$ live? Plot $W$ as a function of $U$. Find the pdf of $W$. Plot $p_W(w)$ as a function of $w$.
   (c) $X = -\log(U)$. On what interval does $X$ live? Plot $X$ as a function of $U$. Find the pdf of $X$. Plot $p_X(x)$ as a function of $x$.

7. Let $X \sim \text{Exp}(\lambda)$ and let $Y = cX$ for some constant $c$.
   (a) Write down the density of $X$.
   (b) Find the density of $Y$.
   (c) Name the distribution of $Y$.

8. A teacher randomly selects a student from a Sta 103 class. Let $X$ be the number of math courses the student has completed. Let $Y = 1$ if the student is female and $Y = 0$ if the student is male. Fifty percent of the class is female. Among the women, thirty percent have completed one math class, forty percent have completed two math classes and thirty percent have completed three. Among the men, thirty percent have completed one math class, fifty percent have completed two math classes and twenty percent have completed three.
   (a) True or False: $X$ and $Y$ are independent.
   (b) Find $E[X \mid Y = 1]$.

9. Sue is studying the Bin(25, .4) distribution. In R she types

       y <- rbinom ( 50, 25, .4 )
       m1 <- mean ( y )
       m2 <- sum ( y ) / 25

       m3 <- sum ( (y-m1)^2 ) / 50

   (a) Is y a number, a vector or a matrix?
   (b) What is the approximate value of m1?
   (c) What is the approximate value of m2?
   (d) What was Sue trying to estimate with m3?

10. The random variables $X$ and $Y$ have joint pdf $f_{X,Y}(x, y) = 1$ in the triangle of the $XY$-plane determined by the points (-1,0), (1,0), and (0,1). Hint: Draw a picture.
   (a) Find $f_X(.5)$.
   (b) Find $f_Y(y)$.
   (c) Find $f_{Y \mid X}(y \mid X = .5)$.
   (d) Find $E[Y \mid X = .5]$.
   (e) Find $f_Y(.5)$.
   (f) Find $f_X(x)$.
   (g) Find $f_{X \mid Y}(x \mid Y = .5)$.
   (h) Find $E[X \mid Y = .5]$.

11. $X$ and $Y$ are uniformly distributed in the unit disk. I.e., the joint density $p(x, y)$ is constant on the region of $\mathbb{R}^2$ such that $x^2 + y^2 \le 1$.
   (a) Find $p(x, y)$.
   (b) Are $X$ and $Y$ independent?
   (c) Find the marginal densities $p(x)$ and $p(y)$.
   (d) Find the conditional densities $p(x \mid y)$ and $p(y \mid x)$.
   (e) Find $E[X]$, $E[X \mid Y = .5]$, and $E[X \mid Y = -.5]$.

12. Verify the claim in Example 1.4 that $\arg\max_\lambda P[x = 3 \mid \lambda] = 3$. Hint: differentiate Equation 1.6.

13. (a) $p$ is the pdf of a continuous random variable $w$. Find $\int_{\mathbb{R}} p(s)\, ds$.
   (b) Find $\int_{\mathbb{R}} p(s)\, ds$ for the pdf in Equation 1.7.

14. Page 7 says "Every pdf must satisfy two properties ..." and that one of them is "$p(y) \ge 0$ for all $y$." Explain why that's not quite right.

15. $p(y) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} y^2}$ is the pdf of a continuous random variable $y$. Find $\int_{-\infty}^{0} p(s)\, ds$.

16. When spun, an unbiased spinner points to some number $y \in (0, 1]$. What is $p(y)$?

17. Some exercises on the densities of transformed variables. One of them should illustrate the need for the absolute value of the Jacobian.

18. (a) Prove: if $X \sim \text{Poi}(\lambda)$ then $E(X) = \lambda$. Hint: use the same trick we used to derive the mean of the Binomial distribution.
   (b) Prove: if $X \sim N(\mu, \sigma)$ then $E(X) = \mu$. Hint: change variables in the integral.

19. (a) Prove: if $X \sim \text{Bin}(n, p)$ then $\mathrm{Var}(X) = np(1-p)$. Hint: use Theorem 1.9.
   (b) Prove: if $X \sim \text{Poi}(\lambda)$ then $\mathrm{Var}(X) = \lambda$. Hint: use the same trick we used to derive the mean of the Binomial distribution and Theorem 1.2.
   (c) If $X \sim \text{Exp}(\lambda)$, find $\mathrm{Var}(X)$. Hint: use Theorem 1.2.
   (d) If $X \sim N(\mu, \sigma)$, find $\mathrm{Var}(X)$. Hint: use Theorem 1.2.

20. (a) Justify each step of Equation 1.12.
   (b) Justify each step of Equation 1.13. Hint: integrate by parts.

21. Let $X_1 \sim \text{Bin}(\ , .1)$, $X_2 \sim \text{Bin}(\ , .9)$ and $X_1 \perp X_2$. Define $Y = X_1 + X_2$. Does $Y$ have the $\text{Bin}(\ , .5)$ distribution? Why or why not?

22. Let $X_1 \sim \text{Bin}(\ , .5)$, $X_2 \sim \text{Bin}(\ , .5)$ and $X_1 \perp X_2$. Define $Y_1 = X_1 + X_2$ and $Y_2 = 2X_1$. Which has the bigger mean: $Y_1$ or $Y_2$? Which has the bigger variance: $Y_1$ or $Y_2$? Justify your answer.

23. Consider customers arriving at a service counter. Interarrival times often have a distribution that is approximately exponential with a parameter $\lambda$ that depends on conditions specific to the particular counter. I.e., $p(t) = \lambda e^{-\lambda t}$. Assume that successive interarrival times are independent of each other. Let $T_1$ be the arrival time of the next customer and $T_2$ be the additional time until the arrival of the second customer.

   (a) What is the joint density of $(T_1, T_2)$?
   (b) Let $S = T_1 + T_2$, the time until the next two customers arrive. What is $P[S \le 5]$; i.e. the probability that at least 2 customers arrive within the next 5 minutes?
   (c) What is $E(S)$?

24. A gambler plays at a roulette table for two hours betting on Red at each spin of the wheel. There are 60 spins during the two hour period. What is the distribution of
   (a) $z$, the number of times the gambler wins,
   (b) $y$, the number of times the gambler loses,
   (c) $x$, the number of times the gambler is jostled by the person standing behind,
   (d) $w$, the gambler's net gain?

25. If human DNA contains xxx bases, and if each base mutates with probability $p$ over the course of a lifetime, what is the average number of mutations per person? What is the variance of the number of mutations per person?

26. Isaac is in 5th grade. Each sentence he writes for homework has a 90% chance of being grammatically correct. The correctness of one sentence does not affect the correctness of any other sentence. He recently wrote a 10 sentence paragraph for a writing assignment. Write a formula for the chance that no more than two sentences are grammatically incorrect.

27. Teams A and B play each other in the World Series of baseball. Team A has a 60% chance of winning each game. What is the chance that B wins the series? (The winner of the series is the first team to win 4 games.)

28. A basketball player shoots ten free throws in a game. She has a 70% chance of making each shot. If she misses the shot, her team has a 30% chance of getting the rebound.
   (a) Let $m$ be the number of shots she makes. What is the distribution of $m$? What are its expected value and variance? What is the chance that she makes somewhere between 5 and 9 shots, inclusive?

   (b) Let $r$ be the number of rebounds her team gets from her free throws. What is the distribution of $r$? What are its expected value and variance? What is the chance that $r \ge 1$?

29. Let $(x, y)$ have joint density function $f_{x,y}$. There are two ways to find $E(y)$. One way is to evaluate $\iint y\, f_{x,y}(x, y)\, dx\, dy$. The other is to start with the joint density $f_{x,y}$, find the marginal density $f_y$, then evaluate $\int y\, f_y(y)\, dy$. Show that these two methods give the same answer.

30. Prove Theorem 1.3 (pg. 39) in the discrete case.

31. Prove Theorem 1.7 (pg. 78) in the continuous case.

32. A researcher randomly selects mother-daughter pairs. Let $x_i$ and $y_i$ be the heights of the $i$'th mother and daughter, respectively. True or False:
   (a) $x_i$ and $x_j$ are independent
   (b) $x_i$ and $y_j$ are independent
   (c) $y_i$ and $y_j$ are independent
   (d) $x_i$ and $y_i$ are independent

33. As part of his math homework Isaac had to roll two dice and record the results. Let X1 be the result of the first die and X2 be the result of the second. What is the probability that X1 = 1 given that X1 + X2 = 5?

34. A doctor suspects a patient has the rare medical condition DS, or disstaticularia, the inability to learn statistics. DS occurs in .01% of the population, or one person in 10,000. The doctor orders a diagnostic test. The test is quite accurate. Among people who have DS the test yields a positive result 99% of the time. Among people who do not have DS the test yields a positive result only 5% of the time.

   For the patient in question, the test result is positive. Calculate the probability that the patient has DS.

35. For various reasons, researchers often want to know the number of people who have participated in embarrassing activities such as illegal drug use, cheating on tests, robbing banks, etc. An opinion poll which asks these questions directly is likely to elicit many untruthful answers. To get around the problem, researchers have devised the method of randomized response. The following scenario illustrates the method.

   A pollster identifies a respondent and gives the following instructions. "Toss a coin, but don't show it to me. If it lands Heads, answer question (a). If it lands tails, answer question (b). Just answer 'yes' or 'no'. Do not tell me which question you are answering.

   Question (a): Does your telephone number end in an even digit?

   Question (b): Have you ever used cocaine?"

   Because the respondent can answer truthfully without revealing his or her cocaine use, the incentive to lie is removed. Researchers hope respondents will tell the truth.

   You may assume that respondents are truthful and that telephone numbers are equally likely to be odd or even. Let $p$ be the probability that a randomly selected person has used cocaine.
   (a) What is the probability that a randomly selected person answers "yes"?
   (b) Suppose we survey 100 people. Let $X$ be the number who answer "yes". What is the distribution of $X$?

36. In a 1991 article (see Utts [1991] and discussants) Jessica Utts reviews some of the history of probability and statistics in ESP research. This question concerns a particular series of autoganzfeld experiments in which a sender looking at a picture tries to convey that picture telepathically to a receiver. Utts explains:

       ... 'autoganzfeld' experiments require four participants. The first is the Receiver (R), who attempts to identify the target material being observed by the Sender (S). The Experimenter (E) prepares R for the task, elicits the response from R and supervises R's judging of the response against the four potential targets. (Judging is double blind; E does not know which is the correct target.) The fourth participant is the lab assistant (LA) whose only task is to instruct the computer to randomly select the target. No one involved in the experiment knows the identity of the target.

       Both R and S are sequestered in sound-isolated, electrically shielded rooms. R is prepared as in earlier ganzfeld studies, with white noise and a field of red light. In a nonadjacent room, S watches the target material on a television and can hear R's target description ('mentation') as it is being given. The mentation is also tape recorded.

       The judging process takes place immediately after the 30-minute sending period. On a TV monitor in the isolated room, R views the four choices from the target pack that contains the actual target. R is asked to rate each one according to how closely it matches the ganzfeld mentation. The ratings are converted to ranks and, if the correct target is ranked first, a direct hit is scored. The entire process is automatically recorded by the computer. The computer then displays the correct choice to R as feedback.

   In the series of autoganzfeld experiments analyzed by Utts, there were a total of 355 trials. Let $X$ be the number of direct hits.
   (a) What are the possible values of $X$?
   (b) Assuming there is no ESP, and no cheating, what is the distribution of $X$?
   (c) Plot the pmf of the distribution in part (b).
   (d) Find $E[X]$ and $SD(X)$.
   (e) Add a Normal approximation to the plot in part (c).
   (f) Judging from the plot in part (c), approximately what values of $X$ are consistent with the "no ESP, no cheating" hypothesis?
   (g) In fact, the total number of hits was $x = 122$. What do you conclude?

37. This exercise is based on a computer lab that another professor uses to teach the Central Limit Theorem. It was originally written in MATLAB but here it's translated into R.

   Enter the following R commands:

       u <- matrix ( runif(250000), 1000, 250 )
       y <- apply ( u, 2, mean )

   These create a 1000x250 (a thousand rows and two hundred fifty columns) matrix of random draws, called u, and a 250-dimensional vector y which contains the means of each column of u.

   Now enter the command hist(u[,1]). This command takes the first column of u (a column vector with 1000 entries) and makes a histogram. Print out

   this histogram and describe what it looks like. What distribution is the runif command drawing from?

   Now enter the command hist(y). This command makes a histogram from the vector y. Print out this histogram. Describe what it looks like and how it differs from the one above. Based on the histogram, what distribution do you think y follows?

   You generated y and u with the same random draws, so how can they have different distributions? What's going on here?

38. Suppose that extensive testing has revealed that people in Group A have IQ's that are well described by a N( , 10) distribution while the IQ's of people in Group B have a N( , 10) distribution. What is the probability that a randomly chosen individual from Group A has a higher IQ than a randomly chosen individual from Group B?
   (a) Write a formula to answer the question. You don't need to evaluate the formula.
   (b) Write some R code to answer the question.

39. The so-called Monty Hall or Let's Make a Deal problem has caused much consternation over the years. It is named for an old television program. A contestant is presented with three doors. Behind one door is a fabulous prize; behind the other two doors are virtually worthless prizes. The contestant chooses a door. The host of the show, Monty Hall, then opens one of the remaining two doors, revealing one of the worthless prizes. Because Monty is the host, he knows which doors conceal the worthless prizes and always chooses one of them to reveal, but never the door chosen by the contestant. Then the contestant is offered the choice of keeping what is behind her original door or trading for what is behind the remaining unopened door. What should she do?

   There are two popular answers.

       There are two unopened doors, they are equally likely to conceal the fabulous prize, so it doesn't matter which one she chooses.

       She had a 1/3 probability of choosing the right door initially, a 2/3 chance of getting the prize if she trades, so she should trade.

   (a) Create a simulation in R to discover which answer is correct.


(b) Show (using formal arguments of conditional probability) which answer is correct.
Make sure your answers to (a) and (b) agree!

40. Prove Theorem 1.6 (pg. 54).


CHAPTER 2
MODES OF INFERENCE

2.1 Data

This chapter takes up the heart of statistics: making inferences, quantitatively, from data. The data, $y_1, \dots, y_n$, are assumed to be a random sample from a population.

In Chapter 1 we reasoned from f to Y. That is, we made statements like "If the experiment is like ..., then f will be ..., and $(y_1, \dots, y_n)$ will look like ..." or "E(Y) must be ...", etc. In Chapter 2 we reason from Y to f. That is, we make statements such as "Since $y_1, \dots, y_n$ turned out to be ... it seems that f is likely to be ...", or "$\int y f(y)\,dy$ is likely to be around ...", etc. This is a basis for knowledge: learning about the world by observing it. Its importance cannot be overstated. The field of statistics illuminates the type of thinking that allows us to learn from data and contains the tools for learning quantitatively.

Reasoning from Y to f works because samples are usually like the populations from which they come. For example, if f has a mean around 6 then most reasonably large samples from f also have a mean around 6, and if our sample has a mean around 6 then we infer that f likely has a mean around 6. If our sample has an SD around 10 then we infer that f likely has an SD around 10, and so on. So much is obvious. But can we be more precise? If our sample has a mean around 6, then can we infer that f likely has a mean somewhere between, say, 5.5 and 6.5, or can we only infer that f likely has a mean between 4 and 8, or even worse, between about -100 and 100? When can we say anything quantitative at all about the mean of f? The answer is not obvious, and that's where statistics comes in. Statistics provides the quantitative tools for answering such questions.

This chapter presents several generic modes of statistical analysis.

Data Description: Data description can be visual, through graphs, charts, etc., or numerical, through calculating sample means, SD's, etc. Displaying a few simple features of the data $y_1, \dots, y_n$ can allow us to visualize those same features of f. Data description requires few a priori assumptions about f.

Likelihood: In likelihood inference we assume that f is a member of a parametric family of distributions $\{f_\theta : \theta \in \Theta\}$. Then inference about f is the same as inference about the parameter $\theta$, and different values of $\theta$ are compared according to how well $f_\theta$ explains the data.

Estimation: The goal of estimation is to estimate various aspects of f, such as its mean, median, SD, etc. Along with the estimate, statisticians try to give quantitative measures of how accurate the estimates are.

Bayesian Inference: Bayesian inference is a way to account not just for the data $y_1, \dots, y_n$, but also for other information we may have about f.

Prediction: Sometimes the goal of statistical analysis is not to learn about f per se but to make predictions about y's that we will see in the future. In addition to the usual problem of not knowing f, we have the additional problem that even if we knew f, we still wouldn't be able to predict future y's exactly.

Hypothesis Testing: Sometimes we want to test hypotheses like "Head Start is good for kids" or "lower taxes are good for the economy" or "the new treatment is better than the old".

Decision Making: Often, decisions have to be made on the basis of what we have learned about f. In addition, making good decisions requires accounting for the potential gains and losses of each decision.

2.2 Data Description

There are many ways, both graphical and numerical, to describe data sets. Sometimes we're interested in means, sometimes variations, sometimes trends through time, and there are good ways to describe and display all these aspects and many more. Simple data description is often enough to shed light on an underlying scientific problem. The subsections of Section 2.2 show some basic ways to describe various types of data.
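The question raised in Section 2.1, how precisely a sample mean pins down the mean of f, can be previewed with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be Normal with mean 6 and SD 10 (the numbers used in the discussion above), and the helper name sample_means, the seed, and the number of replications are arbitrary choices, not part of the text.

```r
# Sketch: how close is a sample mean to the population mean?
# Assumption: the population f is Normal with mean 6 and SD 10.
set.seed(1)                       # arbitrary seed, for reproducibility
pop_mean <- 6
pop_sd   <- 10

# Draw `reps` samples of size n and return their sample means.
# (sample_means is a hypothetical helper name.)
sample_means <- function(n, reps = 2000) {
  replicate(reps, mean(rnorm(n, mean = pop_mean, sd = pop_sd)))
}

sizes <- c(10, 100, 1000)

# Central 95% range of the simulated sample means, one column per sample size
ranges <- sapply(sizes, function(n) quantile(sample_means(n), c(0.025, 0.975)))
colnames(ranges) <- paste0("n=", sizes)
print(round(ranges, 2))

# Width of each range; it shrinks as n grows
widths <- ranges[2, ] - ranges[1, ]
print(round(widths, 2))
```

Under these assumptions the range of plausible sample means narrows as the sample size grows, which is the sense in which larger samples support more precise statements about f.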


2.2.1 Summary Statistics

One of the simplest ways to describe a data set is by a low dimensional summary. For instance, in Example 1.5 on ocean temperatures there were multiple measurements of temperatures from each of 9 locations. The measurements from each location were summarized by the sample mean $\bar{y} = n^{-1} \sum y_i$; comparisons of the 9 sample means helped oceanographers deduce the presence of the Mediterranean tongue. Similarly, the essential features of many data sets can be captured in a one-dimensional or low-dimensional summary. Such a summary is called a statistic. The examples below refer to a data set $y_1, \dots, y_n$ of size n.

Definition 2.1 (Statistic). A statistic is any function, possibly vector valued, of the data.

The most important statistics are measures of location and dispersion. Important examples of location statistics include

mean: The mean of the data is $\bar{y} \equiv n^{-1} \sum y_i$. R can compute means:

    y <- 1:10
    mean(y)

median: A median of the data is any number m such that at least half of the $y_i$'s are less than or equal to m and at least half of the $y_i$'s are greater than or equal to m. We say "a" median instead of "the" median because a data set with an even number of observations has an interval of medians. For example, if y <- 1:10, then every $m \in [5, 6]$ is a median. When R computes a median it computes a single number by taking the midpoint of the interval of medians. So median(y) yields 5.5.

quantiles: For any $p \in [0, 1]$, the p-th quantile of the data should be, roughly speaking, the number q such that $pn$ of the data points are less than q and $(1-p)n$ of the data points are greater than q.

Figure 2.1 illustrates the idea. Panel (a) shows a sample of 100 points plotted as a stripchart (page 108). The black circles on the abscissa are the .05, .5, and .9 quantiles; so 5 points (open circles) are to the left of the first vertical line, 50 points are on either side of the middle vertical line, and 10 points are to the right of the third vertical line. Panel (b) shows the empirical cdf of the sample. The values .05, .5, and .9 are shown as squares on the vertical axis; the quantiles are found by following the horizontal lines from the vertical axis to the cdf, then the vertical lines from the cdf to the horizontal axis. Panels (c) and (d) are similar, but show the distribution from which the sample was drawn instead of showing the sample itself. In panel (c), 5% of the mass is to the left of the first black circle; 50% is on either side of the middle black circle; and 10% is to the right of the third black dot. In panel (d), the open squares are at .05, .5, and .9 on the vertical axis; the quantiles are the circles on the horizontal axis.

Denote the p-th quantile as $q_p(y_1, \dots, y_n)$, or simply as $q_p$ if the data set is clear from the context. With only a finite sized sample $q_p(y_1, \dots, y_n)$ cannot be found exactly. So the algorithm for finding quantiles works as follows.

1. Sort the $y_i$'s in ascending order. Label them $y_{(1)}, \dots, y_{(n)}$ so that $y_{(1)} \le \cdots \le y_{(n)}$.
2. Set $q_0 \equiv y_{(1)}$ and $q_1 \equiv y_{(n)}$.
3. $y_{(2)}$ through $y_{(n-1)}$ determine $n-1$ subintervals in $[y_{(1)}, y_{(n)}]$. So, for $i = 1, \dots, n-2$, set $q_{i/(n-1)} \equiv y_{(i+1)}$.
4. For $p \in \left(\frac{i}{n-1}, \frac{i+1}{n-1}\right)$ let $q_p$ be any number in the interval $\left(q_{i/(n-1)}, q_{(i+1)/(n-1)}\right)$.

If p is a "nice" number then $q_p$ is often given a special name. For example, $q_{.5}$ is the median; $(q_{.25}, q_{.5}, q_{.75})$, the first, second and third quartiles, is a vector-valued statistic of dimension 3; $q_{.1}, q_{.2}, \dots$ are the deciles; $q_{.78}$ is the 78'th percentile.

R can compute quantiles. When faced with $p \in \left(\frac{i}{n-1}, \frac{i+1}{n-1}\right)$ R does linear interpolation. E.g. quantile(y, c(.25, .75)) yields (3.25, 7.75).

The vector $(y_{(1)}, \dots, y_{(n)})$ defined in step 1 of the algorithm for quantiles is an n-dimensional statistic called the order statistic. $y_{(i)}$ by itself is called the i'th order statistic.

Figure 2.1 was created with the following R code.

    par(mfrow = c(2, 2))
    quant <- c(.05, .5, .9)
    nquant <- length(quant)

    # (a) stripchart of a sample, with vertical lines at the sample quantiles
    y <- rgamma(100, 3)
    stripchart(y, method = "jitter", pch = 1, xlim = c(0, 10),
               xlab = "y", main = "(a)")
    abline(v = quantile(y, quant))
    points(x = quantile(y, quant), y = rep(.5, nquant), pch = 19)

    # (b) empirical cdf of the same sample
    plot.ecdf(y, xlab = "y", ylab = "F(y)", xlim = c(0, 10),
              main = "(b)")
    for (q in quant)
      segments(c(0, quantile(y, q)), c(q, 0),
               rep(quantile(y, q), 2), rep(q, 2))
    points(x = quantile(y, quant), y = rep(0, nquant), pch = 19)
    points(x = rep(0, nquant), y = quant, pch = 22)

    # (c) density of the distribution the sample was drawn from
    y <- seq(0, 10, length = 100)
    plot(y, dgamma(y, 3), type = "l", xlim = c(0, 10), ylab = "p(y)",
         main = "(c)")
    points(x = qgamma(quant, 3), y = rep(0, nquant), pch = 19)

    # (d) cdf of that distribution
    plot(y, pgamma(y, 3), type = "l", ylab = "F(y)", main = "(d)")
    for (q in quant)
      segments(c(0, qgamma(q, 3)), c(q, 0), rep(qgamma(q, 3), 2),
               rep(q, 2))
    points(x = qgamma(quant, 3), y = rep(0, nquant), pch = 19)
    points(x = rep(0, nquant), y = quant, pch = 22)

  • plot.ecdf plots the empirical cumulative distribution function. Here the word "empirical" means that the cdf comes from a sample, as opposed to theoretical probability calculations.

Figure 2.1: Quantiles. The black circles are the .05, .5, and .9 quantiles. The open squares are the numbers .05, .5, and .9 on the vertical axis. Panels (a) and (b) are for a sample; panels (c) and (d) are for a distribution.

Dispersion statistics measure how spread out the data are. Since there are many ways to measure dispersion there are many dispersion statistics. Important dispersion statistics include

standard deviation: The sample standard deviation or SD of a data set is
$$ s \equiv \sqrt{\frac{\sum (y_i - \bar{y})^2}{n}} $$


Note: some statisticians prefer
$$ s \equiv \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}} $$
for reasons which do not concern us here. If n is large there is little difference between the two versions of s.

variance: The sample variance is
$$ s^2 \equiv \frac{\sum (y_i - \bar{y})^2}{n} $$
Note: some statisticians prefer
$$ s^2 \equiv \frac{\sum (y_i - \bar{y})^2}{n - 1} $$
for reasons which do not concern us here. If n is large there is little difference between the two versions of $s^2$.

interquartile range: The interquartile range is $q_{.75} - q_{.25}$.

Presenting a low dimensional statistic is useful if we believe that the statistic is representative of the whole population. For instance, in Example 1.5, oceanographers believe the data they have collected are representative of the long term state of the ocean. Therefore the sample means at the nine locations in Figure 1.12 are representative of the long term state of the ocean at those locations. More formally, for each location we can imagine a population of temperatures, one temperature for each moment in time. That population has an unknown pdf f. Even though our data are not really a random sample from f (the sampling times were not chosen randomly, among other problems), we can think of them that way without making too serious an error. The histograms in Figure 1.12 are estimates of the f's for the nine locations. The mean of each f is what oceanographers call a climatological mean, or an average which, because it is taken over a long period of time, represents the climate. The nine sample means are estimates of the nine climatological mean temperatures at those nine locations. Simply presenting the sample means reveals some interesting structure in the data, and hence an interesting facet of physical oceanography.

Often, more than a simple data description or display is necessary; the statistician has to do a bit of exploring the data set. This activity is called exploratory data analysis or simply eda. It is hard to give general rules for eda, although displaying the data in many different ways is often a good idea. The statistician must decide what displays and eda are appropriate for each data set and each question that might be answered by the data set. That is one thing that makes statistics interesting. It cannot be reduced to a set of rules and procedures. A good statistician must be attuned to the potentially unique aspects of each analysis. We now present several examples to show just a few of the possible ways to explore data sets by displaying them graphically. The examples reveal some of the power of graphical display in illuminating data and teasing out what it has to say.

2.2.2 Displaying Distributions

Instead of reducing a data set to just a few summary statistics, it is often helpful to display the full data set. But reading a long list of numbers is usually not helpful; humans are not good at assimilating data in that form. We can learn a lot more from a graphical representation of the data.

Histograms

The next examples use histograms to display the full distribution of some data sets. Visual comparison of the histograms reveals structure in the data.

Example 2.1 (Tooth Growth)
The R statistical language comes with many data sets. Type data() to see what they are. This example uses the data set ToothGrowth on the effect of vitamin C on tooth growth in guinea pigs. You can get a description by typing help(ToothGrowth). You can load the data set into your R session by typing data(ToothGrowth). ToothGrowth is a dataframe of three columns. The first few rows look like this:

       len supp dose
    1  4.2   VC  0.5
    2 11.5   VC  0.5
    3  7.3   VC  0.5

Column 1, or len, records the amount of tooth growth. Column 2, supp, records whether the guinea pig was given vitamin C in ascorbic acid or orange juice. Column 3, dose, records the dose, either 0.5, 1.0 or 2.0 mg. Thus there are six groups of guinea pigs in a two by three layout. Each group has ten guinea pigs, for a total of sixty observations. Figure 2.2 shows histograms of growth for each of the six groups. From Figure 2.2 it is clear that dose affects tooth growth.
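As a numerical complement to the histograms, the summary statistics of Section 2.2.1 can be computed directly. The following is a minimal sketch, not part of the original example: it first checks the location and dispersion statistics on the vector 1:10 used earlier, then tabulates the mean and SD of tooth growth in each of the six groups. The variable names (sd_n, sd_R, group_mean, group_sd) are illustrative only. Note that R's built-in sd() and var() use the divisor n - 1.

```r
# Location statistics on the small example used in Section 2.2.1
y <- 1:10
mean(y)                       # 5.5
median(y)                     # 5.5, the midpoint of the interval of medians [5, 6]
quantile(y, c(0.25, 0.75))    # 3.25 and 7.75, by linear interpolation

# Dispersion: the two versions of the sample SD
n    <- length(y)
sd_n <- sqrt(sum((y - mean(y))^2) / n)   # divisor n
sd_R <- sd(y)                            # R's sd() uses divisor n - 1
IQR(y)                                   # interquartile range, q_.75 - q_.25

# Mean and SD of tooth growth within each delivery-method-by-dose group
group_mean <- with(ToothGrowth, tapply(len, list(supp, dose), mean))
group_sd   <- with(ToothGrowth, tapply(len, list(supp, dose), sd))
print(round(group_mean, 2))
print(round(group_sd, 2))
```

The rows of group_mean and group_sd are the delivery methods and the columns are the doses, matching the two by three layout of the experiment.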


Figure 2.2: Histograms of tooth growth by delivery method (VC or OJ) and dose (.5, 1.0 or 2.0).

Figure 2.3: Histograms of tooth growth by delivery method (VC or OJ) and dose (.5, 1.0 or 2.0).

Figure 2.4: Histograms of tooth growth by delivery method (VC or OJ) and dose (.5, 1.0 or 2.0).


Figure 2.2 was produced by the following R code.

    supp <- unique(ToothGrowth$supp)
    dose <- unique(ToothGrowth$dose)
    par(mfcol = c(3, 2))
    for (i in 1:2)
      for (j in 1:3) {
        # rows belonging to delivery method i and dose j
        good <- (ToothGrowth$supp == supp[i]
                 & ToothGrowth$dose == dose[j])
        hist(ToothGrowth$len[good], breaks = seq(0, 34, by = 2),
             xlab = "", ylab = "",
             main = paste(supp[i], ", ", dose[j], sep = ""))
      }

  • unique(x) returns the unique values in x. For example, if x <- c(1, 1, 2) then unique(x) would be 1 2.

Figure 2.3 is similar to Figure 2.2 but laid out in the other direction. (Notice that it's easier to compare histograms when they are arranged vertically rather than horizontally.) The figures suggest that delivery method does have an effect, but not as strong as the dose effect. Notice also that Figure 2.3 is more difficult to read than Figure 2.2 because the histograms are too tall and narrow. Figure 2.4 repeats Figure 2.3 but using less vertical distance; it is therefore easier to read. Part of good statistical practice is displaying figures in a way that makes them easiest to read and interpret.

The figures alone have suggested that dose is the most important effect, and delivery method less so. A further analysis could try to be more quantitative: what is the typical size of each effect, how sure can we be of the typical size, and how much does the effect vary from animal to animal. The figures already suggest answers, but a more formal analysis is deferred to Section 2.7.

Figures 1.12, 2.2, and 2.3 are histograms. The abscissa has the same scale as the data. The data are divided into bins. The ordinate shows the number of data points in each bin. (hist(..., prob=T) plots the ordinate as probability density rather than counts.) Histograms are a powerful way to display data because they give a strong visual impression of the main features of a data set. However, details of the histogram can depend on both the number of bins and on the cut points between bins. For that reason it is sometimes better to use a display that does not depend on those features, or at least not so strongly. Example 2.2 illustrates.

Density Estimation

Example 2.2 (Hot Dogs)
In June of 1986, Consumer Reports published a study of hot dogs. The data are available at DASL, the Data and Story Library, a collection of data sets for free use by statistics students. DASL says the data are

    "Results of a laboratory analysis of calories and sodium content of major hot dog brands. Researchers for Consumer Reports analyzed three types of hot dog: beef, poultry, and meat (mostly pork and beef, but up to 15% poultry meat)."

You can download the data from http://lib.stat.cmu.edu/DASL/Datafiles/Hotdogs.html. The first few lines look like this:

    Type  Calories  Sodium
    Beef  186       495
    Beef  181       477

This example looks at the calorie content of beef hot dogs. (Later examples will compare the calorie contents of different types of hot dogs.)

Figure 2.5(a) is a histogram of the calorie contents of beef hot dogs in the study. From the histogram one might form the impression that there are two major varieties of beef hot dogs, one with about 130 calories or so, another with about 180 calories or so, and a rare outlier with fewer calories. Figure 2.5(b) is another histogram of the same data but with a different bin width. It gives a different impression, that calorie content is evenly distributed, approximately, from about 130 to about 190 with a small number of lower calorie hot dogs. Figure 2.5(c) gives much the same impression as 2.5(b). It was made with the same bin width as 2.5(a), but with cut points starting at 105 instead of 110. These histograms illustrate that one's impression can be influenced by both bin width and cut points.

Density estimation is a method of reducing dependence on cut points. Let $x_1, \dots, x_{20}$ be the calorie contents of beef hot dogs in the study. We think of $x_1, \dots, x_{20}$ as a random sample from a density f representing the population of all beef hot dogs. Our goal is to estimate f. For any fixed number x, how shall we estimate f(x)? The idea is to use information local to x to estimate f(x). We first describe a basic version, then add two refinements to get kernel density estimation and the density function in R.
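The sensitivity of a histogram to bin width and cut points, described above for Figure 2.5, is easy to reproduce without drawing anything. The sketch below is illustrative only: because the hot dog file does not ship with R, it uses the built-in ToothGrowth$len values as stand-in data, and the particular break points are arbitrary choices.

```r
x <- ToothGrowth$len    # stand-in data (assumption); values lie between 4.2 and 33.9

# Same bin width (5), two different sets of cut points
counts_a <- hist(x, breaks = seq(0, 35, by = 5), plot = FALSE)$counts
counts_b <- hist(x, breaks = seq(2.5, 37.5, by = 5), plot = FALSE)$counts

# Same starting cut point as (a), but twice the bin width
counts_c <- hist(x, breaks = seq(0, 40, by = 10), plot = FALSE)$counts

print(counts_a)
print(counts_b)
print(counts_c)
```

All three vectors describe the same sixty observations; only the binning differs, so any difference in the visual impression comes from the choice of bins rather than from the data.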


Let n be the sample size (20 for the hot dog data). Begin by choosing a number h > 0. For any number x the estimate $\hat{f}_{\text{basic}}(x)$ is defined to be
$$ \hat{f}_{\text{basic}}(x) \equiv \frac{1}{2nh} \sum_{i=1}^{n} 1_{(x-h,\,x+h)}(x_i) = \frac{\text{fraction of sample points within } h \text{ of } x}{2h} $$
$\hat{f}_{\text{basic}}$ has at least two apparently undesirable features.

1. $\hat{f}_{\text{basic}}(x)$ gives equal weight to all data points in the interval $(x-h, x+h)$ and has abrupt cutoffs at the ends of the interval. It would be better to give the most weight to data points closest to x and have the weights decrease gradually for points increasingly further away from x.
2. $\hat{f}_{\text{basic}}(x)$ depends critically on the choice of h.

We deal with these problems by introducing a weight function that depends on distance from x. Let $g_0$ be a probability density function. Usually $g_0$ is chosen to be symmetric and unimodal, centered at 0. Define
$$ \hat{f}(x) \equiv \frac{1}{n} \sum_i g_0(x - x_i) $$
Choosing $g_0$ to be a probability density ensures that $\hat{f}$ is also a probability density because
$$ \int_{-\infty}^{\infty} \hat{f}(x)\,dx = \frac{1}{n} \sum_i \int_{-\infty}^{\infty} g_0(x - x_i)\,dx = 1 \qquad (2.1) $$
When $g_0$ is chosen to be a continuous function it deals nicely with problem 1 above. In fact, $\hat{f}_{\text{basic}}$ comes from taking $g_0$ to be the uniform density on $(-h, h)$.

To deal with problem 2 we rescale $g_0$. Choose a number h > 0 and define a new density $g(x) = h^{-1} g_0(x/h)$. A little thought shows that g differs from $g_0$ by a rescaling of the horizontal axis; the factor $h^{-1}$ compensates to make $\int g = 1$. Now define the density estimate to be
$$ \hat{f}_h(x) \equiv \frac{1}{n} \sum_i g(x - x_i) = \frac{1}{nh} \sum_i g_0\big((x - x_i)/h\big) $$
h is called the bandwidth. Of course $\hat{f}_h$ still depends on h. It turns out that dependence on bandwidth is not really a problem. It is useful to view density estimates for several different bandwidths. Each reveals features of f at different scales. Figures 2.5(d), (e), and (f) are examples. Panel (d) was produced by the default bandwidth; panels (e) and (f) were produced with 1/4 and 1/2 the default bandwidth. Larger bandwidth makes a smoother estimate of f; smaller bandwidth makes it rougher. None is exactly right. It is useful to look at several.
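The formulas above translate almost line for line into R. The following is a sketch under stated assumptions rather than a description of how density() is implemented: kde and kde_basic are hypothetical helper names, the built-in faithful$waiting values stand in for the hot dog calories (which do not ship with R), and the bandwidth comes from bw.nrd0(), the rule behind R's default. The comparison with density() is expected to agree only approximately, since density() evaluates the estimate with a fast binned approximation.

```r
# Kernel density estimate  f_hat_h(x) = (1 / (n h)) * sum_i g0((x - x_i) / h)
# (kde is a hypothetical helper; g0 defaults to the N(0, 1) density)
kde <- function(x, data, h, g0 = dnorm) {
  sapply(x, function(x0) mean(g0((x0 - data) / h)) / h)
}

# Basic version: fraction of sample points within h of x, divided by 2h
kde_basic <- function(x, data, h) {
  sapply(x, function(x0) mean(abs(x0 - data) < h) / (2 * h))
}

x_obs <- faithful$waiting      # stand-in data (assumption)
h     <- bw.nrd0(x_obs)        # default bandwidth rule used by density()
grid  <- seq(min(x_obs) - 3 * h, max(x_obs) + 3 * h, length.out = 200)

# Hand-computed estimate versus R's density() on the same grid
by_hand  <- kde(grid, x_obs, h)
built_in <- density(x_obs, bw = h, from = min(grid), to = max(grid), n = 200)
max_gap  <- max(abs(by_hand - built_in$y))

# The basic estimate is the kernel estimate with a uniform g0 on (-1, 1),
# i.e. a uniform density on (-h, h) after rescaling by h
basic_direct  <- kde_basic(grid, x_obs, h)
basic_uniform <- kde(grid, x_obs, h, g0 = function(u) dunif(u, -1, 1))

# Equation (2.1): the estimate integrates to (approximately) one
area <- integrate(kde, lower = min(x_obs) - 10 * h, upper = max(x_obs) + 10 * h,
                  data = x_obs, h = h)$value
print(c(max_gap = max_gap, area = area))
```

Rerunning the last few lines with h replaced by h / 4 or h / 2 mimics the adjust argument of density() and shows the rougher estimates that smaller bandwidths produce.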


Figure 2.5: (a), (b), (c): histograms of calorie contents of beef hot dogs; (d), (e), (f): density estimates of calorie contents of beef hot dogs.


Figure 2.5 was produced with

    hotdogs <- read.table("data/hotdogs/data", header = TRUE)
    cal.beef <- hotdogs$Calories[hotdogs$Type == "Beef"]
    par(mfrow = c(3, 2))
    hist(cal.beef, main = "(a)", xlab = "calories", ylab = "")
    hist(cal.beef, breaks = seq(110, 190, by = 20), main = "(b)",
         xlab = "calories", ylab = "")
    hist(cal.beef, breaks = seq(105, 195, by = 10), main = "(c)",
         xlab = "calories", ylab = "")
    plot(density(cal.beef), main = "(d)", xlab = "calories",
         ylab = "density")
    plot(density(cal.beef, adjust = 1/4), main = "(e)",
         xlab = "calories", ylab = "density")
    plot(density(cal.beef, adjust = 1/2), main = "(f)",
         xlab = "calories", ylab = "density")

  • In panel (a) R used its default method for choosing histogram bins.
  • In panels (b) and (c) the histogram bins were set by hist(..., breaks=seq(...)).
  • density() produces a kernel density estimate.
  • R uses a Gaussian kernel by default, which means that $g_0$ above is the N(0, 1) density.
  • In panel (d) R used its default method for choosing bandwidth.
  • In panels (e) and (f) the bandwidth was set to 1/4 and 1/2 the default by density(..., adjust=...).

Stripcharts and Dotplots

Figure 2.6 uses the ToothGrowth data to illustrate stripcharts, also called dotplots, an alternative to histograms. Panel (a) has three rows of points corresponding to the three doses of ascorbic acid. Each point is for one animal. The abscissa shows the amount of tooth growth; the ordinate shows the dose. The panel is slightly misleading because points with identical coordinates are plotted directly on top of each other. In such situations statisticians often add a small amount of jitter to the data, to avoid overplotting. The middle panel is a repeat of the top, but with jitter added. The bottom panel shows tooth growth by delivery method. Compare Figure 2.6 to Figures 2.2 and 2.3. Which is a better display for this particular data set?

Figure 2.6 was produced with the following R code.

    par(mfrow = c(3, 1))
    stripchart(ToothGrowth$len ~ ToothGrowth$dose, pch = 1,
               main = "(a)", xlab = "growth", ylab = "dose")
    stripchart(ToothGrowth$len ~ ToothGrowth$dose,
               method = "jitter", main = "(b)", xlab = "growth",
               ylab = "dose", pch = 1)
    stripchart(ToothGrowth$len ~ ToothGrowth$supp,
               method = "jitter", main = "(c)", xlab = "growth",
               ylab = "method", pch = 1)

Figure 2.6: (a) Tooth growth by dose, no jittering; (b) Tooth growth by dose with jittering; (c) Tooth growth by delivery method with jittering

Boxplots

An alternative, useful for comparing many distributions simultaneously, is the boxplot. Example 2.3 uses boxplots to compare scores on 24 quizzes in a statistics course.

Example 2.3 (Quiz Scores)
In the spring semester of 2003, 58 students completed Statistics 103 at Duke University. Figure 2.7 displays their grades.

There were 24 quizzes during the semester. Each was worth 10 points. The upper panel of the figure shows the distribution of scores on each quiz. The abscissa is labelled 1 through 24, indicating the quiz number. For each quiz, the figure shows a boxplot. For each quiz there is a box. The horizontal line through the center of the box is the median grade for that quiz. We can see that the median score on Quiz 2 is around 7, while the median score on Quiz 3 is around 4. The upper end of the box is the 75th percentile (3rd quartile) of scores; the lower end of the box is the 25th percentile (1st quartile). We can see that about half the students scored between about 5 and 8 on Quiz 2, while about half the students scored between about 2 and 6 on Quiz 3. Quiz 3 was tough.

Each box may have whiskers, or dashed lines which extend above and below the box. The exact definition of the whiskers is not important, but they are meant to include most of the data points that don't fall inside the box. (In R, by default, the whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range away from the box.) Finally, there may be some individual points plotted above or below each boxplot. These indicate outliers, or scores that are extremely high or low relative to other scores on that quiz. Many quizzes had low outliers; only Quiz 5 had a high outlier.

Boxplots are extremely useful for comparing many sets of data. We can easily see, for example, that Quiz 5 was the most difficult (75% of the class scored 3 or less) while Quiz 1 was the easiest (over 75% of the class scored 10).

There were no exams or graded homeworks. Students' grades were determined by their best 20 quizzes. To compute grades, each student's scores were sorted, the first 4 were dropped, then the others were averaged. Those averages are displayed in a stripchart in the bottom panel of the figure. It's easy to see that most of the class had quiz averages between about 5 and 9 but that 4 averages were much lower.

Figure 2.7 was produced by the following R code.

    ...  # read in the data
    colnames(scores) <- paste("Q", 1:24, sep = "")   # define column names
    boxplot(data.frame(scores), main = "Individual quizzes")
    scores[is.na(scores)] <- 0           # replace missing scores with 0's
    temp <- apply(scores, 1, sort)       # sort each student's scores
    temp <- temp[5:24, ]                 # drop the 4 lowest
    scores.ave <- apply(temp, 2, mean)   # find the average
    stripchart(scores.ave, "jitter", pch = 1, xlab = "score",
               xlim = c(0, 10), main = "Student averages")

QQ plots

Sometimes we want to assess whether a data set is well modelled by a Normal distribution and, if not, how it differs from Normal. One obvious way to assess Normality is by looking at histograms or density estimates. But the answer is often not obvious from the figure. A better way to assess Normality is with QQ plots. Figure 2.8 illustrates for the nine histograms of ocean temperatures in Figure 1.12.
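The construction behind a Normal QQ plot can be sketched by hand: sort the data and plot them against the values expected from a sorted standard Normal sample of the same size. The code below is an illustration under stated assumptions, not the code used for Figure 2.8. The data are simulated (213 draws from a Normal distribution with mean 7 and SD 2, both chosen arbitrarily), and qnorm(ppoints(n)) is used as a standard approximation to the expected Normal order statistics; it is the same set of plotting positions that qqnorm() uses.

```r
set.seed(2)                               # arbitrary seed
x <- rnorm(213, mean = 7, sd = 2)         # simulated stand-in data (assumption)
n <- length(x)

x_sorted <- sort(x)                       # the order statistic of the data
z_expect <- qnorm(ppoints(n))             # approximate expected N(0,1) order statistic

# qqnorm() pairs each data point with one of these same positions
qq <- qqnorm(x, plot.it = FALSE)
same_positions <- isTRUE(all.equal(sort(qq$x), z_expect))

# Least-squares line through the QQ plot: intercept near the mean, slope near the SD
fit <- coef(lm(x_sorted ~ z_expect))
print(round(fit, 2))
print(round(c(mean = mean(x), sd = sd(x)), 2))

if (interactive()) {
  plot(z_expect, x_sorted, xlab = "expected Normal order statistics",
       ylab = "sorted data", main = "QQ plot by hand")
  abline(coef = fit)
}
```

For data that really are Normal the points fall close to the fitted line; departures from Normality show up as systematic curvature or as stray points at the ends.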


Figure 2.7: Quiz scores from Statistics 103


Each panel in Figure 2.8 was created with the ocean temperatures near a particular (latitude, longitude) combination. Consider, for example, the upper left panel which was constructed from the n = 213 points $x_1, \dots, x_{213}$ taken near (45, -40). Those points are sorted, from smallest to largest, to create the order statistic $(x_{(1)}, \dots, x_{(213)})$. Then they are plotted against $E[(z_{(1)}, \dots, z_{(213)})]$, the expected order statistic from a Normal distribution. If the $x_i$'s are approximately Normal then the QQ plot will look approximately linear. The slope of the line indicates the standard deviation.

In Figure 2.8 most of the panels do look approximately linear, indicating that a Normal model is reasonable. But some of the panels show departures from Normality. In the upper left and lower left panels, for example, the plots look roughly linear except for the upper right corners, which show some data points much warmer than expected if they followed a Normal distribution. In contrast, the coolest temperatures in the lower middle panel are not quite as cool as expected from a Normal distribution.

Figure 2.8 was produced with

    lats <- c(45, 35, 25)
    lons <- c(-40, -30, -20)
    par(mfrow = c(3, 3))
    for (i in 1:3)
      for (j in 1:3) {
        # measurements within one degree of the chosen latitude and longitude
        good <- abs(med.1000$lon - lons[j]) < 1 &
                abs(med.1000$lat - lats[i]) < 1
        qqnorm(med.1000$temp[good], xlab = "", ylab = "",
               sub = paste("n = ", sum(good), sep = ""),
               main = paste("latitude =", lats[i], "\n longitude =",
                            lons[j]))
      }

2.2.3 Exploring Relationships

Sometimes it is the relationships between several random variables that are of interest. For example, in discrimination cases the focus is on the relationship between race or gender on one hand and employment or salary on the other hand. Subsection 2.2.3 shows several graphical ways to display relationships.


Figure 2.8: QQ plots of water temperatures (°C) at 1000m depth


We begin with Example 2.4, an analysis of potential discrimination in admission to UC Berkeley graduate school.

Example 2.4
In 1973 UC Berkeley investigated its graduate admissions rates for potential sex bias. Apparently women were more likely to be rejected than men. The data set UCBAdmissions gives the acceptance and rejection data from the six largest graduate departments on which the study was based. Typing help(UCBAdmissions) tells more about the data. It tells us, among other things:

    ...
    Format:
         A 3-dimensional array resulting from cross-tabulating 4526
         observations on 3 variables. The variables and their levels
         are as follows:

           No  Name    Levels
           1   Admit   Admitted, Rejected
           2   Gender  Male, Female
           3   Dept    A, B, C, D, E, F
    ...

The major question at issue is whether there is sex bias in admissions. To investigate we ask whether men and women are admitted at roughly equal rates.

Typing UCBAdmissions gives the following numerical summary of the data.

    , , Dept = A

              Gender
    Admit      Male Female
      Admitted  512     89
      Rejected  313     19

    , , Dept = B

              Gender
    Admit      Male Female
      Admitted  353     17
      Rejected  207      8


    , , Dept = C

              Gender
    Admit      Male Female
      Admitted  120    202
      Rejected  205    391

    , , Dept = D

              Gender
    Admit      Male Female
      Admitted  138    131
      Rejected  279    244

    , , Dept = E

              Gender
    Admit      Male Female
      Admitted   53     94
      Rejected  138    299

    , , Dept = F

              Gender
    Admit      Male Female
      Admitted   22     24
      Rejected  351    317

For each department, the two-way table of admission status versus sex is displayed. Such a display, called a crosstabulation, simply tabulates the number of entries in each cell of a multiway table. It's hard to tell from the crosstabulation whether there is a sex bias and, if so, whether it is systemic or confined to just a few departments. Let's continue by finding the marginal (aggregated by department as opposed to conditional given department) admissions rates for men and women.

    > apply(UCBAdmissions, c(1, 2), sum)
              Gender
    Admit      Male Female
      Admitted 1198    557
      Rejected 1493   1278

The admission rate for men is 1198/(1198 + 1493) = 44.5% while the admission rate for women is 557/(557 + 1278) = 30.4%, much lower. A mosaic plot, created with

    mosaicplot(apply(UCBAdmissions, c(1, 2), sum),
               main = "Student admissions at UC Berkeley")

is a graphical way to display the discrepancy. (A beautiful example of a mosaic plot is on the cover of CHANCE magazine.) The left column is for admitted students; the heights of the rectangles show how many admitted students were male and how many were female. The right column is for rejected students; the heights of the rectangles show how many were male and female. If sex and admission status were independent, i.e., if there were no sex bias, then the proportion of men among admitted students would equal the proportion of men among rejected students and the heights of the left rectangles would equal the heights of the right rectangles. The apparent difference in heights is a visual representation of the discrepancy in sex ratios among admitted and rejected students. The same data can be viewed as discrepant admission rates for men and women by transposing the matrix:

    mosaicplot(t(apply(UCBAdmissions, c(1, 2), sum)),
               main = "Student admissions at UC Berkeley")

The existence of discrepant sex ratios for admitted and rejected students is equivalent to the existence of discrepant admission rates for males and females and to dependence of sex and admission rates. The lack of discrepant ratios is equivalent to independence of sex and admission rates.

Evidently UC Berkeley admitted men and women at different rates. But graduate admission decisions are not made by a central admissions office; they are made by the individual departments to which students apply. So our next step is to look at admission rates for each department separately. We can look at the crosstabulation on page 115 or make mosaic plots for each department separately (not shown here) with

    ## Mosaic plots for individual departments
    for (i in 1:6)
      mosaicplot(UCBAdmissions[, , i],
                 xlab = "Admit", ylab = "Sex",
                 main = paste("Department", LETTERS[i]))
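The marginal admission rates quoted above, and the corresponding rates within each department, can be checked with a few lines of R. This is a minimal sketch using prop.table(); the variable names (marg, marg_rate, by_dept, applicants, share_male) are illustrative only and do not appear in the original analysis.

```r
# Marginal table: admission status by sex, summed over departments
marg <- apply(UCBAdmissions, c(1, 2), sum)

# Marginal admission rates: proportion admitted within each sex
marg_rate <- prop.table(marg, margin = 2)["Admitted", ]
print(round(100 * marg_rate, 1))          # percentages for Male and Female

# Conditional admission rates: proportion admitted within each sex and department
by_dept <- prop.table(UCBAdmissions, margin = c(2, 3))["Admitted", , ]
print(round(100 * by_dept, 1))            # rows: sex, columns: department

# Number of applicants by sex and department, and the share who were men
applicants <- apply(UCBAdmissions, c(2, 3), sum)
share_male <- applicants["Male", ] / colSums(applicants)
print(round(share_male, 2))
```

Comparing marg_rate with the rows of by_dept shows how a large gap in the aggregated rates can coexist with much smaller gaps department by department.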


Figure 2.9: Mosaic plot of UCBAdmissions

Figure 2.10: Mosaic plot of UCBAdmissions


The plots show that in each department men and women are admitted at roughly equal rates. The following snippet calculates and prints the rates. It confirms the rough equality except for department A, which admitted women at a higher rate than men.

    for (i in 1:6) {                                  # for each department
      temp <- UCBAdmissions[, , i]                    # that department's data
      m <- temp[1, 1] / (temp[1, 1] + temp[2, 1])     # Men's admission rate
      w <- temp[1, 2] / (temp[1, 2] + temp[2, 2])     # Women's admission rate
      print(c(m, w))                                  # print them
    }

Note that departments A and B, which had high admission rates, also had large numbers of male applicants, while departments C, D, E and F, which had low admission rates, had large numbers of female applicants. The generally accepted explanation for the discrepant marginal admission rates is that men tended to apply to departments that were easy to get into while women tended to apply to departments that were harder to get into. A more sinister explanation is that the university gave more resources to departments with many male applicants, allowing them to admit a greater proportion of their applicants. The data we've analyzed are consistent with both explanations; the choice between them must be made on other grounds.

One lesson here for statisticians is the power of simple data displays and summaries. Another is the need to consider the unique aspects of each data set. The explanation of different admissions rates for men and women could only be discovered by someone familiar with how universities and graduate schools work, not by following some general rules about how to do statistical analyses.

The next example is about the duration of eruptions and interval to the next eruption of the Old Faithful geyser. It explores two kinds of relationships: the relationship between duration and interval, and also the relationship of each variable with time.

Example 2.5 (Old Faithful)
Old Faithful is a geyser in Yellowstone National Park and a great tourist attraction. As Denby and Pregibon [1987] explain, "From August 1 to August 8, 1978, rangers and naturalists at Yellowstone National Park recorded the duration of eruption and interval to the next eruption (both in minutes) for eruptions of Old Faithful between 6 a.m. and midnight. The intent of the study was to predict the time of the next eruption, to be posted at the Visitor's Center so that visitors to Yellowstone can usefully budget their time." The R dataset faithful contains the data. In addition to the references listed there, the data and analyses can also be found in Weisberg [1985] and Denby and Pregibon [1987]. The latter analysis emphasizes graphics, and we shall follow some of their suggestions here.

We begin exploring the data with stripcharts and density estimates of durations and intervals. These are shown in Figure 2.11. The figure suggests bimodal distributions. For duration there seems to be one bunch of data around two minutes and another around four or five minutes. For interval, the modes are around 50 minutes and 80 minutes. A plot of interval versus duration, Figure 2.12, suggests that the bimodality is present in the joint distribution of the two variables. Because the data were collected over time, it might be useful to plot the data in the order of collection. That's Figure 2.13. The horizontal scale in Figure 2.13 is so compressed that it's hard to see what's going on. Figure 2.14 repeats Figure 2.13 but divides the time interval into two subintervals to make the plots easier to read. The subintervals overlap slightly. The persistent up-and-down character of Figure 2.14 shows that, for the most part, long and short durations are interwoven, as are long and short intervals. (Figure 2.14 is potentially misleading. The data were collected over an eight day period. There are eight separate sequences of eruptions with gaps in between. The faithful data set does not tell us where the gaps are. Denby and Pregibon [1987] tell us where the gaps are and use the eight separate days to find errors in data transcription.) Just this simple analysis, a collection of four figures, has given us insight into the data that will be very useful in predicting the time of the next eruption.

Figures 2.11, 2.12, 2.13, and 2.14 were produced with the following R code.

    data(faithful)
    attach(faithful)

    # Figure 2.11 (the lower xlim values were lost in extraction; 1 and 40 are assumed)
    par(mfcol = c(2, 2))
    stripchart(eruptions, method = "jitter", pch = 1, xlim = c(1, 6),
               xlab = "duration (min)", main = "(a)")
    plot(density(eruptions), type = "l", xlim = c(1, 6),
         xlab = "duration (min)", main = "(b)")
    stripchart(waiting, method = "jitter", pch = 1, xlim = c(40, 100),
               xlab = "waiting (min)", main = "(c)")
    plot(density(waiting), type = "l", xlim = c(40, 100),
         xlab = "waiting (min)", main = "(d)")

    # Figure 2.12
    par(mfrow = c(1, 1))
    plot(eruptions, waiting, xlab = "duration of eruption",
         ylab = "time to next eruption")

Figure 2.11: Old Faithful data: duration of eruptions and waiting time between eruptions. Stripcharts: (a) and (c). Density estimates: (b) and (d).

Figure 2.12: Waiting time versus duration in the Old Faithful dataset

Figure 2.13: (a): duration and (b): waiting time plotted against data number in the Old Faithful dataset

Figure 2.14: (a1), (a2): duration and (b1), (b2): waiting time plotted against data number in the Old Faithful dataset

    # Figure 2.13: both variables in order of collection
    par(mfrow = c(2, 1))
    plot.ts(eruptions, xlab = "data number", ylab = "duration",
            main = "(a)")
    plot.ts(waiting, xlab = "data number", ylab = "waiting time",
            main = "(b)")

    # Figure 2.14: the same series split into two overlapping subintervals
    par(mfrow = c(4, 1))
    plot.ts(eruptions[1:150], xlab = "data number",
            ylab = "duration", main = "(a1)")
    plot.ts(eruptions[130:272], xlab = "data number",
            ylab = "duration", main = "(a2)")
    plot.ts(waiting[1:150], xlab = "data number",
            ylab = "waiting time", main = "(b1)")
    plot.ts(waiting[130:272], xlab = "data number",
            ylab = "waiting time", main = "(b2)")

Figures 2.15 and 2.16 introduce coplots, a tool for visualizing the relationship among three variables. They represent the ocean temperature data from Example 1.5. In Figure 2.15 there are six panels in which temperature is plotted against latitude. Each panel is made from the points in a restricted range of longitude. The upper panel, the one spanning the top of the Figure, shows the six different ranges of longitude. For example, the first longitude range runs from about -10 to about -17. Points whose longitude is in the interval $(-17, -10)$ go into the upper right panel of scatterplots. These are the points very close to the mouth of the Mediterranean Sea. Looking at that panel we see that temperature increases very steeply from South to North, until about $35^\circ$, at which point it starts to decrease steeply as we go further North. That's because we're crossing the Mediterranean tongue at a point very close to its source.

The other longitude ranges are about $(-20, -13)$, $(-25, -16)$, $(-30, -20)$, $(-34, -25)$ and $(-40, -28)$. They are used to create the scatterplot panels in the upper center, upper left, lower right, lower center, and lower left, respectively. The general impression is

• temperatures decrease slightly as we move East to West,
• the angle in the scatterplot becomes slightly shallower as we move East to West, and
• there are some points that don't fit the general pattern.

Notice that the longitude ranges are overlapping and not of equal width. The ranges are chosen by R to have a little bit of overlap and to put roughly equal numbers of points into each range.

Figure 2.16 reverses the roles of latitude and longitude. The impression is that temperature increases gradually from West to East. These two figures give a fairly clear picture of the Mediterranean tongue.

Figures 2.15 and 2.16 were produced by

    coplot(temp ~ lat | lon)
    coplot(temp ~ lon | lat)

Example 2.6 shows one way to display the relationship between two sequences of events.

Example 2.6 (Neurobiology)
To learn how the brain works, neurobiologists implant electrodes into animal brains. These electrodes are fine enough to record the firing times of individual neurons. A sequence of firing times of a neuron is called a spike train. Figure 2.17 shows the spike train from one neuron in the gustatory cortex of a rat while the rat was in an experiment on taste. This particular rat was in the experiment for a little over 80 minutes. Those minutes are marked on the y-axis. The x-axis is marked in seconds. Each dot on the plot shows a time at which the neuron fired. We can see, for example, that this neuron fired about nine times in the first five seconds, then was silent for about the next ten seconds. We can also see, for example, that this neuron undergoes some episodes of very rapid firing lasting up to about 10 seconds.

Since this neuron is in the gustatory cortex (the part of the brain responsible for taste) it is of interest to see how the neuron responds to various tastes. During the experiment the rat was licking a tube that sometimes delivered a drop of water and sometimes delivered a drop of water in which a chemical, or tastant, was dissolved. The 55 short vertical lines on the plot show the times at which the rat received a drop of 300 millimolar (.3 M) solution of NaCl. We can examine the plot for relationships between deliveries of NaCl and activity of the neuron.

Figure 2.15: Temperature versus latitude for different values of longitude

Figure 2.16: Temperature versus longitude for different values of latitude

Figure 2.17: Spike train from a neuron during a taste experiment. The dots show the times at which the neuron fired. The solid lines show times at which the rat received a drop of a .3 M solution of NaCl.

Figure 2.17 was produced by

    datadir <- "~/research/neuro/data/stapleton/"

    # firing times, one vector per neuron
    spikes <- list(
      sig002a = scan(paste(datadir, "sig002a.txt", sep = "")),
      sig002b = scan(paste(datadir, "sig002b.txt", sep = "")),
      sig002c = scan(paste(datadir, "sig002c.txt", sep = "")),
      sig003a = scan(paste(datadir, "sig003a.txt", sep = "")),
      sig003b = scan(paste(datadir, "sig003b.txt", sep = "")),
      sig004a = scan(paste(datadir, "sig004a.txt", sep = "")),
      sig008a = scan(paste(datadir, "sig008a.txt", sep = "")),
      sig014a = scan(paste(datadir, "sig014a.txt", sep = "")),
      sig014b = scan(paste(datadir, "sig014b.txt", sep = "")),
      sig017a = scan(paste(datadir, "sig017a.txt", sep = ""))
    )

    # delivery times, one vector per tastant
    tastants <- list(
      MSG100  = scan(paste(datadir, "MSG100.txt", sep = "")),
      MSG300  = scan(paste(datadir, "MSG300.txt", sep = "")),
      NaCl100 = scan(paste(datadir, "NaCl100.txt", sep = "")),
      NaCl300 = scan(paste(datadir, "NaCl300.txt", sep = "")),
      water   = scan(paste(datadir, "water.txt", sep = ""))
    )

    # seconds within the minute (x-axis) against the minute (y-axis)
    stripchart(spikes[[8]] %% 60 ~ spikes[[8]] %/% 60, pch = ".",
               main = "a spike train", xlab = "seconds", ylab = "minutes")
    points(tastants$NaCl300 %% 60, tastants$NaCl300 %/% 60 + 1,
           pch = "|")

• The line datadir <- ... stores the name of the directory in which I keep the neuro data. When used in paste it identifies individual files.

• The command list creates a list. The elements of a list can be anything. In this case the list named spikes has ten elements whose names are sig002a, sig002b, ..., and sig017a. The list named tastants has five elements whose names are MSG100, MSG300, NaCl100, NaCl300, and water.

• Each element of the list is the result of a scan. scan reads a file and stores the result in a vector. So spikes is a list of ten vectors. Each vector contains the firing times, or a spike train, of one neuron. tastants is a list of five vectors. Each vector contains the times at which a particular tastant was delivered.

• There are two ways to refer to an element of a list. For example, spikes[[8]] refers to the eighth element of spikes while tastants$NaCl300 refers to the element named NaCl300.

• Lists are useful for keeping related objects together, especially when those objects are not of the same type or length. In this example spikes$sig002a is a vector whose length is the number of times neuron 002a fired, while the length of spikes$sig002b is the number of times neuron 002b fired. Since those lengths are not the same, the data don't fit neatly into a matrix, so we use a list instead.

2.3 Likelihood

2.3.1 The Likelihood Function

It often happens that we observe data from a distribution that is not known precisely but whose general form is known. For example, we may know that the data come from a Poisson distribution, $X \sim \mathrm{Poi}(\lambda)$, but we don't know the value of $\lambda$. We may know that $X \sim \mathrm{Bin}(n, \theta)$ but not know $\theta$. Or we may know that the values of $X$ are densely clustered around some central value and sparser on both sides, so we decide to model $X \sim \mathrm{N}(\mu, \sigma)$, but we don't know the values of $\mu$ and $\sigma$. In these cases there is a whole family of probability distributions indexed by either $\lambda$, $\theta$, or $(\mu, \sigma)$. We call $\lambda$, $\theta$, or $(\mu, \sigma)$ the unknown parameter; the family of distributions is called a parametric family. Often, the goal of the statistical analysis is to learn about the value of the unknown parameter. Of course, learning which value of the parameter is the true one, or which values of the parameter are plausible in light of the data, is the same as learning which member of the family is the true one, or which members of the family are plausible in light of the data.

The different values of the parameter, or the different members of the family, represent different theories or hypotheses about nature. A sensible way to discriminate among the theories is according to how well they explain the data. Recall the Seedlings data (Examples 1.4, 1.6, 1.7 and 1.9) in which $X$ was the number of new seedlings in a forest quadrat, $X \sim \mathrm{Poi}(\lambda)$, and different values of $\lambda$ represent different theories or hypotheses about the arrival rate of new seedlings. When $X$ turned out to be 3, how well a value of $\lambda$ explains the data is measured by $\Pr[X = 3 \mid \lambda]$. This probability, as a function of $\lambda$, is called the likelihood function and denoted $\ell(\lambda)$. It says how well each value of $\lambda$ explains the datum $X = 3$. Figure 1.6 (pg. 19) is a plot of the likelihood function.

In a typical problem we know the data come from a parametric family indexed by a parameter $\theta$, i.e. $X_1, \dots, X_n \sim \text{i.i.d. } f(x \mid \theta)$, but we don't know $\theta$. The joint density of all the data is

$$f(X_1, \dots, X_n \mid \theta) = \prod f(X_i \mid \theta). \qquad (2.2)$$

Equation 2.2, as a function of $\theta$, is the likelihood function. We sometimes write $f(\text{Data} \mid \theta)$ instead of indicating each individual datum. To emphasize that we are thinking of a function of $\theta$ we may also write the likelihood function as $\ell(\theta)$ or $\ell(\theta \mid \text{Data})$.

The interpretation of the likelihood function is always in terms of ratios. If, for example, $\ell(\theta_1)/\ell(\theta_2) > 1$, then $\theta_1$ explains the data better than $\theta_2$. If $\ell(\theta_1)/\ell(\theta_2) = k$ then $\theta_1$ explains the data $k$ times better than $\theta_2$. To illustrate, suppose students in a statistics class conduct a study to estimate the fraction of cars on Campus Drive that are red. Student A decides to observe the first 10 cars and record $X$, the number that are red. Student A observes

NR, R, NR, NR, NR, R, NR, NR, NR, R

and records $X = 3$. She did a Binomial experiment; her statistical model is $X \sim \mathrm{Bin}(10, \theta)$; her likelihood function is $\ell_A(\theta) = \binom{10}{3} \theta^3 (1 - \theta)^7$. It is plotted in Figure 2.18. Because only ratios matter, the likelihood function can be rescaled by any arbitrary positive constant. In Figure 2.18 it has been rescaled so the maximum is 1. The interpretation of Figure 2.18 is that values of $\theta$ around $\theta \approx 0.3$ explain the data best, but that any value of $\theta$ in the interval from about 0.1 to about 0.6 explains the data not too much worse than the best. I.e., $\theta \approx 0.3$ explains the data only about 10 times better than $\theta \approx 0.1$ or $\theta \approx 0.6$, and a factor of 10 is not really very much. On the other hand, values of $\theta$ less than about 0.05 or greater than about 0.7 explain the data much worse than $\theta \approx 0.3$.

Figure 2.18: Likelihood function $\ell(\theta)$ for the proportion $\theta$ of red cars on Campus Drive

Figure 2.18 was produced by the following snippet.

    theta <- seq(0, 1, by = .01)   # some values of theta
    y <- dbinom(3, 10, theta)      # calculate l(theta)
    y <- y / max(y)                # rescale
    plot(theta, y, type = "l", xlab = expression(theta),
         ylab = "likelihood function")

• expression is R's way of getting mathematical symbols and formulae into plot labels. For more information, type help(plotmath).

To continue the example, Student B decides to observe cars until the third red one drives by and record $Y$, the total number of cars that drive by until the third red one. Students A and B went to Campus Drive at the same time and observed the same cars. B records $Y = 10$. For B the likelihood function is

$$\begin{aligned}
\ell_B(\theta) &= P[Y = 10 \mid \theta] \\
&= P[\text{2 reds among first 9 cars}] \times P[\text{10th car is red}] \\
&= \binom{9}{2} \theta^2 (1 - \theta)^7 \times \theta \\
&= \binom{9}{2} \theta^3 (1 - \theta)^7
\end{aligned}$$

$\ell_B$ differs from $\ell_A$ by the multiplicative constant $\binom{9}{2} / \binom{10}{3}$. But since multiplicative constants don't matter, A and B really have the same likelihood function and hence exactly the same information about $\theta$. Student B would also use Figure 2.18 as the plot of her likelihood function.

Student C decides to observe every car for a period of 10 minutes and record $Z_1, \dots, Z_k$ where $k$ is the number of cars that drive by in 10 minutes and each $Z_i$ is either 1 or 0 according to whether the $i$'th car is red. When C went to Campus Drive with A and B, only 10 cars drove by in the first 10 minutes. Therefore C recorded exactly the same data as A and B. Her likelihood function is

$$\ell_C(\theta) = (1-\theta)\,\theta\,(1-\theta)(1-\theta)(1-\theta)\,\theta\,(1-\theta)(1-\theta)(1-\theta)\,\theta = \theta^3 (1 - \theta)^7$$

$\ell_C$ is proportional to $\ell_A$ and $\ell_B$ and hence contains exactly the same information and looks exactly like Figure 2.18. So even though the students planned different experiments they ended up with the same data, and hence the same information about $\theta$.

The next example follows the Seedling story and shows what happens to the likelihood function as data accumulates.
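
The proportionality of the three students' likelihoods can also be checked numerically. The following is a minimal sketch, not part of the original analysis: the names lik_A, lik_B, lik_C and rescale are ours, and Student B's model is written with R's dnbinom, which counts the non-red cars (7) seen before the third red one.

```r
# Likelihoods for Students A, B and C on a grid of theta values.
theta <- seq(0.01, 0.99, by = 0.01)

lik_A <- dbinom(3, size = 10, prob = theta)   # 3 red cars among 10
lik_B <- dnbinom(7, size = 3, prob = theta)   # 7 non-red cars before the 3rd red one
lik_C <- theta^3 * (1 - theta)^7              # probability of the observed sequence

# Each ratio is constant in theta, so the likelihoods are proportional.
ratio_BA <- lik_B / lik_A   # choose(9, 2) / choose(10, 3) = 0.3 everywhere
ratio_CA <- lik_C / lik_A   # 1 / choose(10, 3) everywhere

# After rescaling so the maximum is 1, the three curves coincide.
rescale <- function(l) l / max(l)
max_gap <- max(abs(rescale(lik_A) - rescale(lik_B)),
               abs(rescale(lik_A) - rescale(lik_C)))
```

Because the ratios do not depend on theta, plotting any of the three rescaled vectors against theta should reproduce the curve in Figure 2.18.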

Example 2.7 (Seedlings, cont.)
Examples 1.4, 1.6, 1.7, and 1.9 reported data from a single quadrat on the number of new seedlings to emerge in a given year. In fact, ecologists collected data from multiple quadrats over multiple years. In the first year there were 60 quadrats and a total of 40 seedlings so the likelihood function was

$$\ell(\lambda) \equiv p(\text{Data} \mid \lambda) = p(y_1, \dots, y_{60} \mid \lambda) = \prod_{i=1}^{60} p(y_i \mid \lambda) = \prod_{i=1}^{60} \frac{e^{-\lambda} \lambda^{y_i}}{y_i!} \propto e^{-60\lambda} \lambda^{40}$$

Note that $\prod y_i!$ is a multiplicative factor that does not depend on $\lambda$ and so is irrelevant to $\ell(\lambda)$. Note also that $\ell(\lambda)$ depends only on $\sum y_i$, not on the individual $y_i$'s. I.e., we only need to know $\sum y_i = 40$; we don't need to know the individual $y_i$'s. $\ell(\lambda)$ is plotted in Figure 2.19. Compare to Figure 1.6 (pg. 19). Figure 2.19 is much more peaked. That's because it reflects much more information, 60 quadrats instead of 1. The extra information pins down the value of $\lambda$ much more accurately.

Figure 2.19 was created with

    lam <- seq(0, 2, length = 50)
    lik <- dpois(40, 60 * lam)    # proportional to exp(-60 lam) lam^40
    lik <- lik / max(lik)
    plot(lam, lik, xlab = expression(lambda),
         ylab = "likelihood", type = "l")

The next example is about a possible cancer cluster in California.

Example 2.8 (Slater School)
This example was reported in Brodeur [1992]. See Lavine [1999] for further analysis.

The Slater school is an elementary school in Fresno, California where teachers and staff were concerned about the presence of two high-voltage transmission lines that ran past the school .... Their concern centered on the high incidence of cancer at Slater ....

Figure 2.19: $\ell(\lambda)$ after $\sum y_i = 40$ in 60 quadrats.

To address their concern, Dr. Raymond Neutra of the California Department of Health Services' Special Epidemiological Studies Program conducted a statistical analysis on the "eight cases of invasive cancer, ..., the total years of employment of the hundred and forty-five teachers, teachers' aides, and staff members, ..., [and] the number of person-years in terms of National Cancer Institute statistics showing the annual rate of invasive cancer in American women between the ages of forty and forty-four (the age group encompassing the average age of the teachers and staff at Slater) [which] enabled him to calculate that 4.2 cases of cancer could have been expected to occur among the Slater teachers and staff members ...."

For our purposes we can assume that $X$, the number of invasive cancer cases at the Slater School, has the Binomial distribution $X \sim \mathrm{Bin}(145, \theta)$. We observe $x = 8$. The likelihood function

$$\ell(\theta) \propto \theta^8 (1 - \theta)^{137} \qquad (2.3)$$

is pictured in Figure 2.20. From the Figure it appears that values of $\theta$ around .05 or .06 explain the data better than values less than .05 or greater than .06, but that values of $\theta$ anywhere from about .02 or .025 up to about .11 explain the data reasonably well.
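
The impression taken from the plot can be backed up by computing the likelihood at a few values of $\theta$ relative to its peak. This is only an illustrative sketch: the helper names slater_lik and rel_lik and the particular candidate values are our own choices, and the peak is taken at 8/145, where this Binomial likelihood is largest.

```r
# Binomial likelihood for the Slater School data: 8 cases among 145 staff.
slater_lik <- function(theta) dbinom(8, size = 145, prob = theta)

# Relative likelihood: each theta compared with the best-supported value.
theta_best <- 8 / 145
rel_lik <- function(theta) slater_lik(theta) / slater_lik(theta_best)

candidates <- c(0.02, 0.025, 0.05, 0.06, 0.11)
rel <- rel_lik(candidates)

# One row of theta values, one row of relative likelihoods.
round(rbind(theta = candidates, relative_likelihood = rel), 3)
```

Values near .05 and .06 should come out close to 1, while the values at the edges of the quoted range come out roughly an order of magnitude smaller.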

Figure 2.20: Likelihood for Slater School

Figure 2.20 was produced by the following R code.

    theta <- seq(0, .2, length = 100)
    lik <- dbinom(8, 145, theta)
    lik <- lik / max(lik)
    plot(theta, lik, xlab = expression(theta),
         ylab = "likelihood", type = "l", yaxt = "n")

The first line of code creates a sequence of 100 values of $\theta$ at which to compute $\ell(\theta)$, the second line does the computation, the third line rescales so the maximum likelihood is 1, and the fourth line makes the plot.

Examples 2.7 and 2.8 show how likelihood functions are used. They reveal which values of a parameter the data support (equivalently, which values of a parameter explain the data well) and which values they don't support (which values explain the data poorly). There is no hard line between support and non-support. Rather, the plot of the likelihood function shows the smoothly varying levels of support for different values of the parameter.

Because likelihood ratios measure the strength of evidence for or against one hypothesis as opposed to another, it is important to ask how large a likelihood ratio needs to be before it can be considered strong evidence. Or, to put it another way, how strong is the evidence in a likelihood ratio of 10, or 100, or 1000, or more? One way to answer the question is to construct a reference experiment, one in which we have an intuitive understanding of the strength of evidence and can calculate the likelihood; then we can compare the calculated likelihood to the known strength of evidence.

For our reference experiment imagine we have two coins. One is a fair coin, the other is two-headed. We randomly choose a coin. Then we conduct a sequence of coin tosses to learn which coin was selected. Suppose the tosses yield $n$ consecutive Heads. $P[n \text{ Heads} \mid \text{fair}] = 2^{-n}$; $P[n \text{ Heads} \mid \text{two-headed}] = 1$. So the likelihood ratio is $2^n$. That's our reference experiment. A likelihood ratio around 8 is like tossing three consecutive Heads; a likelihood ratio around 1000 is like tossing ten consecutive Heads.

In Example 2.8 $\operatorname{argmax} \ell(\theta) \approx .055$ and $\ell(.025)/\ell(.055) \approx .13 \approx 1/8$, so the evidence against $\theta = .025$ as opposed to $\theta = .055$ is about as strong as the evidence against the fair coin when three consecutive Heads are tossed. The same can be said for the evidence against $\theta = .1$. Similarly, $\ell(.011)/\ell(.055) \approx \ell(.15)/\ell(.055) \approx .001$, so the evidence against $\theta = .011$ or $\theta = .15$ is about as strong as 10 consecutive Heads. A fair statement of the evidence is that $\theta$'s in the interval from about $\theta = .025$ to about $\theta = .1$ explain the data not much worse than the maximum of $\theta \approx .055$. But $\theta$'s below about .01 or larger than about .15 explain the data not nearly as well as $\theta$'s around .055.

2.3.2 Likelihoods from the Central Limit Theorem

Sometimes it is not possible to compute the likelihood function exactly, either because it is too difficult or because we don't know what it is. But we can often compute an approximate likelihood function using the Central Limit Theorem. The following example is the simplest case, but typifies the more exotic cases we will see later on.

Suppose we sample $X_1, X_2, \dots, X_n$ from a probability density $f$. We don't know what $f$ is; we don't even know what parametric family it belongs to. Assume that $f$ has a mean $\mu$ and an SD $\sigma$ (i.e., assume that the mean and variance are finite) and that we would like to learn about $\mu$. If $(\mu, \sigma)$ are the only unknown parameters then the likelihood function is $\ell(\mu, \sigma) = f(\text{Data} \mid \mu, \sigma) = \prod f(X_i \mid \mu, \sigma)$. But we don't know $f$ and can't calculate $\ell(\mu, \sigma)$. However, we can reason as follows.

1. Most of the information in the data for learning about $\mu$ is contained in $\bar{X}$. That is, $\bar{X}$ tells us a lot about $\mu$ and the deviations $\delta_i \equiv X_i - \bar{X}$, $i = 1, \dots, n$ tell us very little.

2. If $n$ is large then the Central Limit Theorem tells us $\bar{X} \sim \mathrm{N}(\mu, \sigma/\sqrt{n})$, approximately.

3. We can estimate $\sigma^2$ from the data by $\hat{\sigma}^2 = s^2 = \sum \delta_i^2 / n$.

4. And therefore the function

$$\ell_M(\mu) \propto \exp\left( -\frac{1}{2} \left( \frac{\mu - \bar{X}}{\hat{\sigma}/\sqrt{n}} \right)^2 \right) \qquad (2.4)$$

is a good approximation to the likelihood function.

In the preceding reasoning we separated the data into two parts, $\bar{X}$ and $\{\delta_i\}$; used $\{\delta_i\}$ to estimate $\sigma$; and used $\bar{X}$ to find a likelihood function for $\mu$. We cannot, in general, justify such a separation mathematically. We justify it if and when our main interest is in $\mu$ and we believe $\{\delta_i\}$ tell us little about $\mu$.

Function 2.4 is called a marginal likelihood function. Tsou and Royall [1995] show that marginal likelihoods are good approximations to true likelihoods and can be used to make accurate inferences, at least in cases where the Central Limit Theorem applies. We shall use marginal likelihoods throughout this book.

Example 2.9 (Slater School, continued)
We redo the Slater School example (Example 2.8) to illustrate the marginal likelihood and see how it compares to the exact likelihood. In that example the $X_i$'s were 1's and 0's indicating which teachers got cancer. There were 8 1's out of 145 teachers, so $\bar{X} = 8/145 \approx .055$. Also, $\hat{\sigma}^2 = \left( 8 (137/145)^2 + 137 (8/145)^2 \right) / 145 \approx .052$, so $\hat{\sigma} \approx .23$. We get

$$\ell_M(\mu) \propto \exp\left( -\frac{1}{2} \left( \frac{\mu - .055}{.23/\sqrt{145}} \right)^2 \right) \qquad (2.5)$$

Figure 2.21 shows the marginal and exact likelihood functions. The marginal likelihood is a reasonably good approximation to the exact likelihood.
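
The arithmetic in Example 2.9 follows steps 1 to 4 directly, and can be reproduced from the raw 0/1 data. The sketch below is ours rather than the book's: the vector x is rebuilt from the counts reported in the example, and the names delta, sigma_hat, lik_marginal and lik_exact are hypothetical.

```r
# Slater School data coded as 1 (cancer) or 0 (no cancer) for 145 staff.
x <- c(rep(1, 8), rep(0, 137))
n <- length(x)

xbar      <- mean(x)                  # step 1: 8/145, about .055
delta     <- x - xbar                 # deviations from the sample mean
sigma_hat <- sqrt(sum(delta^2) / n)   # step 3: divisor n; about .23

# Step 4: marginal likelihood (2.4), scaled so its maximum is 1.
lik_marginal <- function(mu) {
  exp(-0.5 * ((mu - xbar) / (sigma_hat / sqrt(n)))^2)
}

# Exact Binomial likelihood, rescaled so its maximum is also 1.
lik_exact <- function(theta) dbinom(8, n, theta) / dbinom(8, n, xbar)

# Side-by-side comparison at a few values of the parameter.
mu <- c(0.03, 0.055, 0.08)
round(cbind(mu, marginal = lik_marginal(mu), exact = lik_exact(mu)), 3)
```

Note that for 0/1 data the estimate in step 3 reduces to $\bar{X}(1 - \bar{X})$, which is why the two expressions for $\hat{\sigma}^2$ agree.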

Figure 2.21: Marginal and exact likelihoods for Slater School

Figure 2.21 was produced by the following snippet.

    theta <- seq(0, .2, length = 100)
    lik <- dbeta(theta, 9, 138)   # exact: proportional to theta^8 (1 - theta)^137
    lik.mar <- dnorm(theta, 8/145,
                     sqrt((8*(137/145)^2 + 137*(8/145)^2) / 145) / sqrt(145))
    lik <- lik / max(lik)
    lik.mar <- lik.mar / max(lik.mar)
    matplot(theta, cbind(lik, lik.mar), xlab = expression(mu),
            ylab = "likelihood", type = "l", lty = c(2, 1), col = 1)
    legend(.1, 1, c("marginal", "exact"), lty = c(1, 2))

Example 2.10 (CEO salary)
How much are corporate CEO's paid? Forbes magazine collected data in 1993 that can begin to answer this question. The data are available on-line at DASL, the Data and Story Library, a collection of data sets for free use by statistics students. DASL says

"Forbes magazine published data on the best small firms in 1993. These were firms with annual sales of more than five and less than $350 million. Firms were ranked by five-year average return on investment. The data extracted are the age and annual salary of the chief executive officer for the first 60 ranked firms. In question are the distribution patterns for the ages and the salaries."

You can download the data from http://lib.stat.cmu.edu/DASL/Datafiles/ceodat.html. The first few lines look like this:

    AGE  SAL
     53  145
     43  621
     33  262

In this example we treat the Forbes data as a random sample of size $n = 60$ of CEO salaries for small firms. We're interested in the average salary $\mu$. Our approach is to calculate the marginal likelihood function $\ell_M(\mu)$.

Figure 2.22(a) shows a stripchart of the data. Evidently, most salaries are in the range of $200 to $400 thousand, but with a long right-hand tail. Because the right-hand tail is so much larger than the left, the data are not even approximately Normally distributed. But the Central Limit Theorem tells us that $\bar{X}$ is approximately Normally distributed, so the method of marginal likelihood applies. Figure 2.22(b) displays the marginal likelihood function $\ell_M(\mu)$.

Figure 2.22: Marginal likelihood for mean CEO salary

Figure 2.22 was produced by the following snippet.

    ceo <- read.table("data/ceo_salaries/data", header = T)
    par(mfrow = c(2, 1))
    stripchart(ceo$SAL, "jitter", pch = 1, main = "(a)",
               xlab = "Salary (thousands of dollars)")

    m <- mean(ceo$SAL, na.rm = T)
    s <- sqrt(var(ceo$SAL, na.rm = T) / (length(ceo$SAL) - 1))   # SD of the sample mean
    x <- seq(340, 470, length = 40)
    y <- dnorm(x, m, s)
    y <- y / max(y)
    plot(x, y, type = "l", xlab = "mean salary",
         ylab = "likelihood", main = "(b)")

• In s <- sqrt(...) the (length(ceo$SAL) - 1) is there to account for one missing data point.

• y <- y/max(y) doesn't accomplish much and could be omitted.

The data strongly support the conclusion that the mean salary is between about $350 and $450 thousand. That's much smaller than the range of salaries on display in Figure 2.22(a). Why?

Is inference about the mean salary useful in this data set? If not, what would be better?

2.3.3 Likelihoods for several parameters

What if there are two unknown parameters? Then the likelihood is a function of two variables. For example, if the $X_i$'s are a sample from $\mathrm{N}(\mu, \sigma)$ then the likelihood is a function of $(\mu, \sigma)$. The next example illustrates the point.

Example 2.11 (FACE, continued)
This example continues Example 1.12 about a FACE experiment in Duke Forest. There were six rings; three were treated with excess CO2. The dominant canopy tree in the FACE experiment is Pinus taeda, or loblolly pine. Figure 2.23(a) is a histogram of the final basal area of each loblolly pine in 1998 divided by its initial basal area in 1996. It shows that the trees in Ring 1 grew an average of about 30% but with variability that ranged from close to 0% on the low end to around 50% or 60% on the high end. Because the data are clustered around a central value and fall off roughly equally on both sides they can be well approximated by a Normal distribution. But with what mean and SD? What values of $(\mu, \sigma)$ might reasonably produce the histogram in Figure 2.23(a)?

The likelihood function is

$$\ell(\mu, \sigma) = \prod_{i=1}^{n} f(x_i \mid \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x_i - \mu)^2} \propto \sigma^{-n} e^{-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2}$$

Figure 2.23(b) is a contour plot of the likelihood function. The dot in the center, where $(\mu, \sigma) \approx (1.27, .098)$, is where the likelihood function is highest. That is the value of $(\mu, \sigma)$ that best explains the data. The next contour line is drawn where the likelihood is about 1/4 of its maximum; then the next is at 1/16 the maximum, the next at 1/64, and the last at 1/256 of the maximum. They show values of $(\mu, \sigma)$ that explain the data less and less well.

Ecologists are primarily interested in $\mu$ because they want to compare the $\mu$'s from different rings to see whether the excess CO2 has affected the average growth rate. (They're also interested in the $\sigma$'s, but that's a secondary concern.) But $\ell$ is a function of both $\mu$ and $\sigma$, so it's not immediately obvious that the data tell us anything about $\mu$ by itself. To investigate further, Figure 2.23(c) shows slices through the likelihood function at $\sigma = .09, .10,$ and $.11$, the locations of the dashed lines in Figure 2.23(b). The three curves are almost identical. Therefore, the relative support for different values of $\mu$ does not depend very much on the value of $\sigma$, and therefore we are justified in interpreting any of the curves in Figure 2.23(c) as a likelihood function for $\mu$ alone, showing how well different values of $\mu$ explain the data. In this case, it looks as though values of $\mu$ in the interval $(1.25, 1.28)$ explain the data much better than values outside that interval.

Figure 2.23: FACE Experiment, Ring 1. (a): final basal area ÷ initial basal area; (b): contours of the likelihood function. (c): slices of the likelihood function.

Figure 2.23 was produced with

    par(mfrow = c(2, 2))   # a 2 by 2 array of plots
    x <- ba98$BA.final / ba96$BA.init
    x <- x[!is.na(x)]
    hist(x, prob = T, xlab = "basal area ratio",
         ylab = "", main = "(a)")

    mu <- seq(1.2, 1.35, length = 50)
    sd <- seq(.08, .12, length = 50)
    lik <- matrix(NA, 50, 50)
    for (i in 1:50)
      for (j in 1:50)
        lik[i,j] <- prod(dnorm(x, mu[i], sd[j]))
    lik <- lik / max(lik)
    contour(mu, sd, lik, levels = 4^(-4:0), drawlabels = F,
            xlab = expression(mu), ylab = expression(sigma),
            main = "(b)")
    abline(h = c(.09, .1, .11), lty = 2)

    lik.09 <- lik[,13] / max(lik[,13])
    lik.10 <- lik[,26] / max(lik[,26])
    lik.11 <- lik[,38] / max(lik[,38])
    matplot(mu, cbind(lik.09, lik.10, lik.11), type = "l",
            col = 1, main = "(c)",
            xlab = expression(mu), ylab = "likelihood")

• The line x <- x[!is.na(x)] is there because some data is missing. This line selects only those data that are not missing and keeps them in x. When x is a vector, is.na(x) is another vector, the same length as x, with TRUE or FALSE indicating where x is missing. The ! is "not", or negation, so x[!is.na(x)] selects only those values that are not missing.

• The lines mu <- ... and sd <- ... create a grid of $\mu$ and $\sigma$ values at which to evaluate the likelihood.

• The line lik <- matrix(NA, 50, 50) creates a matrix for storing the values of $\ell(\mu, \sigma)$ on the grid. The next three lines are a loop to calculate the values and put them in the matrix.

• The line lik <- lik/max(lik) rescales all the values in the matrix so the maximum value is 1. Rescaling makes it easier to set the levels in the next line.

• contour produces a contour plot. contour(mu, sd, lik, ...) specifies the values on the x-axis, the values on the y-axis, and a matrix of values on the grid. The levels argument says at what levels to draw the contour lines, while drawlabels=F says not to print numbers on those lines. (Make your own contour plot without using drawlabels to see what happens.)

• abline is used for adding lines to plots. You can say either abline(h=...) or abline(v=...) to get horizontal and vertical lines, or abline(intercept, slope) to get arbitrary lines.

• lik.09, lik.10, and lik.11 pick out three columns from the lik matrix. They are the three columns for the values of $\sigma$ closest to $\sigma = .09, .10, .11$. Each column is rescaled so its maximum is 1.

Example 2.12 (Quiz Scores, continued)
This example continues Example 2.3 about scores in Statistics 103. Figure 2.7 shows that most students scored between about 5 and 10, while 4 students were well below the rest of the class. In fact, those students did not show up for every quiz so their averages were quite low. But the remaining students' scores were clustered together in a way that can be adequately described by a Normal distribution. What do the data say about $(\mu, \sigma)$?

Figure 2.24 shows the likelihood function. The data support values of $\mu$ from about 7.0 to about 7.6 and values of $\sigma$ from about 0.8 to about 1.2. A good description of the data is that most of it follows a Normal distribution with $(\mu, \sigma)$ in the indicated intervals, except for 4 students who had low scores not fitting the general pattern. Do you think the instructor should use this analysis to assign letter grades and, if so, how?

Figure 2.24 was produced by

    x <- sort(scores.ave)[5:58]   # drop the four lowest averages
    mu <- seq(6.8, 7.7, length = 60)
    sig <- seq(.7, 1.3, length = 60)
    lik <- matrix(NA, 60, 60)
    for (i in 1:60)
      for (j in 1:60)
        lik[i,j] <- prod(dnorm(x, mu[i], sig[j]))
    lik <- lik / max(lik)
    contour(mu, sig, lik, xlab = expression(mu),
            ylab = expression(sigma))

Examples 2.11 and 2.12 have likelihood contours that are roughly circular, indicating that the likelihood function for one parameter does not depend very strongly on the value of the other parameter, and so we can get a fairly clear picture of what the data say about one parameter in isolation. But in other data sets two parameters may be inextricably entwined. Example 2.13 illustrates the problem.
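
The grid calculations in Examples 2.11 and 2.12 multiply many densities together, which for larger samples can overflow or underflow. One common alternative, sketched here under our own assumptions, is to work with the log likelihood and exponentiate only at the end. The data below are simulated stand-ins, not the FACE or quiz data, and the names loglik1 and loglik are hypothetical.

```r
# Simulated stand-in for the growth ratios; NOT the FACE data.
set.seed(1)
x <- rnorm(100, mean = 1.27, sd = 0.1)

mu  <- seq(1.2, 1.35, length = 50)
sig <- seq(0.08, 0.12, length = 50)

# Log likelihood at a single (mu, sigma) pair.
loglik1 <- function(m, s) sum(dnorm(x, mean = m, sd = s, log = TRUE))

# Evaluate over the whole grid; Vectorize lets outer() call it pair by pair.
loglik <- outer(mu, sig, Vectorize(loglik1))

# Subtract the maximum before exponentiating, so the largest value is 1.
lik <- exp(loglik - max(loglik))

contour(mu, sig, lik, levels = 4^(-4:0), drawlabels = FALSE,
        xlab = expression(mu), ylab = expression(sigma))
```

On data of this size the result agrees with the double loop used in the examples; the log-scale version simply avoids forming the raw product.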

Figure 2.24: Likelihood function for Quiz Scores

Example 2.13 (Seedlings, continued)
Examples 1.4, 1.6, and 1.7 introduced an observational study by ecologists to learn about tree seedling emergence and survival. Some species, Red Maple or Acer rubrum for example, get a mark called a bud scale scar when they lose their leaves over winter. By looking for bud scale scars ecologists can usually tell whether an Acer rubrum seedling is New (in its first summer), or Old (already survived through at least one winter). When they make their annual observations they record the numbers of New and Old Acer rubrum seedlings in each quadrat. Every Old seedling in year $t$ must have been either a New or an Old seedling in year $t - 1$.

Table 2.1 shows the 1992-1993 data for quadrat 6. Clearly the data are inconsistent; where did the Old seedling come from in 1993? When confronted with this paradox the ecologists explained that some New seedlings emerge from the ground after the date of the Fall census but before the winter. Thus they are not counted in the census their first year, but develop a bud scale scar and are counted as Old seedlings in their second year. One such seedling must have emerged in 1992, accounting for the Old seedling in 1993.

    Year   No. of New seedlings   No. of Old seedlings
    1992            0                      0
    1993            0                      1

Table 2.1: Numbers of New and Old seedlings in quadrat 6 in 1992 and 1993.

How shall we model the data? Let $N_i^T$ be the true number of New seedlings in year $i$, i.e., including those that emerge after the census; and let $N_i^O$ be the observed number of seedlings in year $i$, i.e., those that are counted in the census. As in Example 1.4 we model $N_i^T \sim \mathrm{Poi}(\lambda)$. Furthermore, each seedling has some chance $\theta_f$ of being found in the census. (Nominally $\theta_f$ is the proportion of seedlings that emerge before the census, but in fact it may also include a component accounting for the failure of ecologists to find seedlings that have already emerged.) Treating the seedlings as independent and all having the same $\theta_f$ leads to the model $N_i^O \sim \mathrm{Bin}(N_i^T, \theta_f)$. The data are the $N_i^O$'s; the $N_i^T$'s are not observed. What do the data tell us about the two parameters $(\lambda, \theta_f)$?

Ignore the Old seedlings for now and just look at 1992 data $N_{1992}^O = 0$. Dropping the subscript 1992, the likelihood function is

$$\begin{aligned}
\ell(\lambda, \theta_f) &= P[N^O = 0 \mid \lambda, \theta_f] \\
&= \sum_{n=0}^{\infty} P[N^O = 0, N^T = n \mid \lambda, \theta_f] \\
&= \sum_{n=0}^{\infty} P[N^T = n \mid \lambda] \, P[N^O = 0 \mid N^T = n, \theta_f] \\
&= \sum_{n=0}^{\infty} \frac{e^{-\lambda} \lambda^n}{n!} (1 - \theta_f)^n \\
&= \sum_{n=0}^{\infty} \frac{e^{-\lambda(1 - \theta_f)} \left( \lambda (1 - \theta_f) \right)^n}{e^{\lambda \theta_f} \, n!} \\
&= e^{-\lambda \theta_f}
\end{aligned} \qquad (2.6)$$

The last step holds because the terms $e^{-\lambda(1-\theta_f)} (\lambda(1-\theta_f))^n / n!$ are Poisson probabilities and sum to 1.

Figure 2.25(a) plots $\log_{10} \ell(\lambda, \theta_f)$. (We plotted $\log_{10} \ell$ instead of $\ell$ for variety.) The contour lines are not circular. To see what that means, focus on the curve $\log_{10} \ell(\lambda, \theta_f) = -1$ which runs from about $(\lambda, \theta_f) = (2.5, 1)$ to about $(\lambda, \theta_f) = (6, .4)$. Points $(\lambda, \theta_f)$ along that curve explain the datum $N^O = 0$ about 1/10 as well as the m.l.e. (The m.l.e. is any pair where either $\lambda = 0$ or $\theta_f = 0$.) Points below and to the left of that curve explain the datum better than 1/10 of the maximum.

The main parameter of ecological interest is $\lambda$, the rate at which New seedlings tend to arrive. The figure shows that values of $\lambda$ as large as 6 can have reasonably large likelihoods and hence explain the data reasonably well, at least if we believe that $\theta_f$ might be as small as .4. To investigate further, Figure 2.25(b) is similar to 2.25(a) but includes values of $\lambda$ as large as 1000. It shows that even values of $\lambda$ as large as 1000 can have reasonably large likelihoods if they're accompanied by sufficiently small values of $\theta_f$. In fact, arbitrarily large values of $\lambda$ coupled with sufficiently small values of $\theta_f$ can have likelihoods arbitrarily close to the maximum. So from the data alone, there is no way to rule out extremely large values of $\lambda$. Of course extremely large values of $\lambda$ don't make ecological sense, both in their own right and because extremely small values of $\theta_f$ are also not sensible. Scientific background information of this type is incorporated into statistical analysis often through Bayesian inference (Section 2.5). But the point here is that $\lambda$ and $\theta_f$ are linked, and the data alone does not tell us much about either parameter individually.

Figure 2.25: Log of the likelihood function for $(\lambda, \theta_f)$ in Example 2.13

Figure 2.25(a) was produced with the following snippet.

    lam <- seq(0, 6, by = .1)
    th <- seq(0, 1, by = .02)
    lik <- matrix(NA, length(lam), length(th))
    for (i in seq(along = lam))
      for (j in seq(along = th))
        lik[i,j] <- exp(-lam[i] * th[j])
    contour(lam, th, log10(lik),
            levels = c(0, -.2, -.6, -1, -1.5, -2),
            xlab = expression(lambda),
            ylab = expression(theta[f]), main = "(a)")

• log10 computes the base 10 logarithm.

Figure 2.25(b) was produced with the following snippet.

    lam2 <- seq(0, 1000, by = 1)
    lik2 <- matrix(NA, length(lam2), length(th))
    for (i in seq(along = lam2))
      for (j in seq(along = th))
        lik2[i,j] <- exp(-lam2[i] * th[j])
    contour(lam2, th, log10(lik2), levels = c(0, -1, -2, -3),
            xlab = expression(lambda),
            ylab = expression(theta[f]), main = "(b)")

We have now seen two examples (2.11 and 2.12) in which likelihood contours are roughly circular and one (2.13) in which they're not. By far the most common and important case is similar to Example 2.11 because it applies when the Central Limit Theorem applies. That is, there are many instances in which we are trying to make an inference about a parameter $\theta$ and can invoke the Central Limit Theorem saying that for some statistic $t$, $t \sim \mathrm{N}(\theta, \sigma_t)$ approximately and where we can estimate $\sigma_t$. In these cases we can, if necessary, ignore any other parameters in the problem and make an inference about $\theta$ based on $\ell_M(\theta)$.
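
The closed form in Equation 2.6 of Example 2.13 can be checked by carrying out the sum over the unobserved count numerically. This is a sketch for illustration only: the function names lik_by_sum and lik_closed are ours, and the infinite sum is truncated at an arbitrary n_max, which is adequate for the moderate values of $\lambda$ used here.

```r
# P[N^O = 0 | lambda, theta_f], summing over the unobserved true count N^T.
# The infinite sum is truncated at n_max, an arbitrary choice made here.
lik_by_sum <- function(lambda, theta_f, n_max = 200) {
  n <- 0:n_max
  sum(dpois(n, lambda) * dbinom(0, size = n, prob = theta_f))
}

# Closed form from Equation 2.6.
lik_closed <- function(lambda, theta_f) exp(-lambda * theta_f)

lik_by_sum(2.5, 1)    # compare with lik_closed(2.5, 1)
lik_by_sum(6, 0.4)    # compare with lik_closed(6, 0.4)

# Pairs with the same product lambda * theta_f have the same likelihood,
# which is why the data cannot separate the two parameters.
same_product <- lik_closed(c(6, 60, 600), c(0.4, 0.04, 0.004))
```

The last line makes the linkage between the parameters concrete: multiplying $\lambda$ by ten and dividing $\theta_f$ by ten leaves the likelihood unchanged.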


2.4 Estimation

Sometimes the purpose of a statistical analysis is to compute a single best guess at a parameter \(\theta\). An informed guess at the value of \(\theta\) is called an estimate and denoted \(\hat\theta\). One way to estimate \(\theta\) is to find \(\hat\theta \equiv \operatorname{argmax}\, \ell(\theta)\), the value of \(\theta\) for which \(\ell(\theta)\) is largest and hence the value of \(\theta\) that best explains the data. That's the subject of Section 2.4.1.

2.4.1 The Maximum Likelihood Estimate

In many statistics problems there is a unique value of \(\theta\) that maximizes \(\ell(\theta)\). This value is called the maximum likelihood estimate, or m.l.e. of \(\theta\) and denoted \(\hat\theta\).

\[
\hat\theta \equiv \operatorname{argmax}_{\theta}\, p(y \mid \theta) = \operatorname{argmax}_{\theta}\, \ell(\theta).
\]

For instance, in Example 2.8 and Figure 2.20 \(\theta\) was the rate of cancer occurrence and we calculated \(\ell(\theta)\) based on \(y = 8\) cancers in 145 people. Figure 2.20 suggests that the m.l.e. is about \(\hat\theta \approx .05\).

When \(\ell(\theta)\) is differentiable, the m.l.e. can be found by differentiating and equating to zero. In Example 2.8 the likelihood was \(\ell(\theta) \propto \theta^{8}(1-\theta)^{137}\). The derivative is

\[
\frac{d\ell(\theta)}{d\theta} \propto 8\theta^{7}(1-\theta)^{137} - 137\,\theta^{8}(1-\theta)^{136}
= \theta^{7}(1-\theta)^{136}\,\bigl[\,8(1-\theta) - 137\theta\,\bigr].
\tag{2.7}
\]

Equating to 0 yields

\[
\begin{aligned}
0 &= 8(1-\theta) - 137\theta \\
145\,\theta &= 8 \\
\theta &= 8/145 \approx .055
\end{aligned}
\]

So \(\hat\theta \approx .055\) is the m.l.e. Of course if the mode is flat, there are multiple modes, the maximum occurs at an endpoint, or \(\ell\) is not differentiable, then more care is needed.

Equation 2.7 shows more generally the m.l.e. for Binomial data. Simply replace 137 with \(n - y\) and 8 with \(y\) to get \(\hat\theta = y/n\). In the Exercises you will be asked to find the m.l.e. for data from other types of distributions.

There is a trick that is often useful for finding m.l.e.'s. Because \(\log\) is a monotone function, \(\operatorname{argmax}\, \ell(\theta) = \operatorname{argmax}\, \log \ell(\theta)\), so the m.l.e. can be found by maximizing \(\log \ell\). For i.i.d. data, \(\ell(\theta) = \prod p(y_i \mid \theta)\), \(\log \ell(\theta) = \sum \log p(y_i \mid \theta)\), and it is often easier to differentiate the sum than the product. For the Slater example the math would look like this:

\[
\begin{aligned}
\log \ell(\theta) &= 8\log\theta + 137\log(1-\theta) \\
\frac{d \log \ell(\theta)}{d\theta} &= \frac{8}{\theta} - \frac{137}{1-\theta} \\
\frac{137}{1-\theta} &= \frac{8}{\theta} \\
137\,\theta &= 8 - 8\theta \\
\theta &= \frac{8}{145}.
\end{aligned}
\]

Equation 2.7 shows that if \(y_1, \ldots, y_n \sim \mathrm{Bern}(\theta)\) then the m.l.e. of \(\theta\) is

\[
\hat\theta = n^{-1}\sum y_i = \text{sample mean}
\]

The Exercises ask you to show the following.

1. If \(y_1, \ldots, y_n \sim N(\mu, \sigma)\) then the m.l.e. of \(\mu\) is \(\hat\mu = n^{-1}\sum y_i = \) sample mean.

2. If \(y_1, \ldots, y_n \sim \mathrm{Poi}(\lambda)\) then the m.l.e. of \(\lambda\) is \(\hat\lambda = n^{-1}\sum y_i = \) sample mean.

3. If \(y_1, \ldots, y_n \sim \mathrm{Exp}(\lambda)\) then the m.l.e. of \(\lambda\) is \(\hat\lambda = n^{-1}\sum y_i = \) sample mean.

2.4.2 Accuracy of Estimation

Finding the m.l.e. is not enough. Statisticians want to quantify the accuracy of \(\hat\theta\) as an estimate of \(\theta\). In other words, we want to know what other values of \(\theta\), in addition to \(\hat\theta\), have reasonably high likelihood (provide a reasonably good explanation of the data). And what does "reasonable" mean? Section 2.4.2 addresses this question.
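The calculus in Section 2.4.1 can also be checked by maximizing the log likelihood numerically. The sketch below is an illustration rather than the book's own code; it uses R's built-in `optimize`, and the search interval `c(0.001, 0.999)` is an assumption chosen only to keep the logarithms finite.

```r
# Log likelihood for y = 8 cancers among n = 145 people (Slater School data)
y <- 8
n <- 145
loglik <- function(theta) y * log(theta) + (n - y) * log(1 - theta)

# optimize() searches the interval for a maximum when maximum = TRUE
fit <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE)

print(fit$maximum)   # numerical m.l.e.
print(y / n)         # closed-form m.l.e. from Equation 2.7
```

The two printed values should agree to within the tolerance of the optimizer. A numerical search like this is most useful when the derivative has no convenient closed form.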


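As we saw from the reference experiment in Section 2.3, the evidence is not very strong against any value of \(\theta\) such that \(\ell(\theta) > \ell(\hat\theta)/10\). So when considering estimation accuracy it is useful to think about sets such as

\[
\mathrm{LS}_{.1} \equiv \left\{ \theta : \frac{\ell(\theta)}{\ell(\hat\theta)} \ge .1 \right\}
\]

LS stands for likelihood set. More generally, for any \(\alpha \in (0, 1)\) we define the likelihood set of level \(\alpha\) to be

\[
\mathrm{LS}_{\alpha} \equiv \left\{ \theta : \frac{\ell(\theta)}{\ell(\hat\theta)} \ge \alpha \right\}
\]

\(\mathrm{LS}_{\alpha}\) is the set of \(\theta\)'s that explain the data reasonably well, and therefore the set of \(\theta\)'s best supported by the data, where the quantification of "reasonable" and "best" is determined by \(\alpha\). The notion is only approximate and meant as a heuristic reference; in reality there is no strict cutoff between reasonable and unreasonable values of \(\theta\). Also, there is no uniquely best value of \(\alpha\). We frequently use \(\alpha \approx .1\) for convenience and custom.

In many problems the likelihood function \(\ell(\theta)\) is continuous and unimodal, i.e. strictly decreasing away from \(\hat\theta\), and goes to 0 as \(\theta \to \pm\infty\), as in Figures 2.19 and 2.20. In these cases, \(\theta \approx \hat\theta \Rightarrow \ell(\theta) \approx \ell(\hat\theta)\). So values of \(\theta\) close to \(\hat\theta\) explain the data almost as well as and are about as plausible as \(\hat\theta\), and \(\mathrm{LS}_{\alpha}\) is an interval

\[
\mathrm{LS}_{\alpha} = [\theta_l, \theta_u]
\]

where \(\theta_l\) and \(\theta_u\) are the lower and upper endpoints, respectively, of the interval.

In Example 2.9 (Slater School) \(\hat\theta = 8/145\), so we can find \(\ell(\hat\theta)\) on a calculator, or by using R's built-in function dbinom(8, 145, 8/145), which yields about .144. Then \(\theta_l\) and \(\theta_u\) can be found by trial and error. Since dbinom(8, 145, .023) \(\approx .013\) and dbinom(8, 145, .105) \(\approx .015\), we conclude that \(\mathrm{LS}_{.1} \approx [.023, .105]\) is a rough likelihood interval for \(\theta\). Review Figure 2.20 to see whether this interval makes sense.

The data in Example 2.9 could pin down \(\theta\) to an interval of width about .08. In general, an experiment will pin down \(\theta\) to an extent determined by the amount of information in the data. As data accumulates so does information and the ability to determine \(\theta\). Typically the likelihood function becomes increasingly more peaked as \(n \to \infty\), leading to increasingly accurate inference for \(\theta\). We saw that in Figures 1.6 and 2.19. Example 2.14 illustrates the point further.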
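The trial-and-error search for the endpoints of \(\mathrm{LS}_{.1}\) can be automated. The following sketch is one way to do it, not the method used in the text: it applies R's root finder `uniroot` to the likelihood ratio minus \(\alpha\), once on each side of \(\hat\theta\). The helper name `g` and the bracketing endpoints `1e-6` and `1 - 1e-6` are assumptions introduced for illustration.

```r
# Slater School data: y = 8 cancers among n = 145 people
y <- 8
n <- 145
theta.hat <- y / n
alpha <- 0.1

# Likelihood ratio relative to the maximum, minus the level alpha.
# g() is positive inside the likelihood set and negative outside it.
g <- function(theta) dbinom(y, n, theta) / dbinom(y, n, theta.hat) - alpha

# One root below theta.hat and one above it
lower <- uniroot(g, interval = c(1e-6, theta.hat))$root
upper <- uniroot(g, interval = c(theta.hat, 1 - 1e-6))$root

print(c(lower = lower, upper = upper))
```

The result can be compared with the rough interval \([.023, .105]\) found above by trial and error. This approach relies on the likelihood being unimodal, so that each side of \(\hat\theta\) contains exactly one crossing.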


Example 2.14 (Craps, continued)

Example 1.10 introduced a computer simulation to learn the probability \(\theta\) of winning the game of craps. In this example we use that simulation to illustrate the effect of gathering ever increasing amounts of data. We'll start by running the simulation just a few times, and examining the likelihood function \(\ell(\theta)\). Then we'll add more and more simulations and see what happens to \(\ell(\theta)\).

The result is in Figure 2.26. The flattest curve is for 3 simulations, and the curves become increasingly peaked for 9, 27, and 81 simulations. After only 3 simulations \(\mathrm{LS}_{.1} \approx [.15, .95]\) is quite wide, reflecting the small amount of information. But after 9 simulations \(\ell(\theta)\) has sharpened so that \(\mathrm{LS}_{.1} \approx [.05, .55]\) is much smaller. After 27 simulations \(\mathrm{LS}_{.1}\) has shrunk further to about \([.25, .7]\), and after 81 it has shrunk even further to about \([.38, .61]\).

Figure 2.26: Likelihood function for the probability \(\theta\) of winning a game of craps. The four curves are for 3, 9, 27, and 81 simulations.


Figure 2.26 was produced with the following snippet.

    # sim.craps() is the craps simulation introduced in Example 1.10
    n.sim <- c(3, 9, 27, 81)
    th <- seq(0, 1, length=200)
    lik <- matrix(NA, 200, length(n.sim))
    for (i in seq(along=n.sim)) {
      wins <- 0
      for (j in 1:n.sim[i])
        wins <- wins + sim.craps()
      lik[,i] <- dbinom(wins, n.sim[i], th)
      lik[,i] <- lik[,i] / max(lik[,i])
    }
    matplot(th, lik, type="l", col=1, lty=1:4,
            xlab=expression(theta), ylab="likelihood")

In Figure 2.26 the likelihood function looks increasingly like a Normal density as the number of simulations increases. That is no accident; it is the typical behavior in many statistics problems. Section 2.4.3 explains the reason.

2.4.3 The sampling distribution of an estimator

The estimator \(\hat\theta\) is a function of the data \(y_1, \ldots, y_n\). If we repeat the experiment and get new data we also get a new \(\hat\theta\). So \(\hat\theta\) is a random variable and has a distribution called the sampling distribution of \(\hat\theta\) and denoted \(F_{\hat\theta}\). We studied \(F_{\hat\theta}\) in Example 1.11 where we used simulation to estimate the probability \(\theta\) of winning a game of craps. For each sample size of \(n = 50, 200, 1000\) we did 1000 simulations. Each simulation yielded a different \(\hat\theta\). Those 1000 \(\hat\theta\)'s are a random sample of size 1000 from \(F_{\hat\theta}\). Figure 1.19 showed boxplots of the simulations.

Now we examine the sampling distribution of \(\hat\theta\) in more detail. There are at least two reasons for doing so. First, \(F_{\hat\theta}\) is another way, in addition to likelihood sets, of assessing the accuracy of \(\hat\theta\) as an estimator of \(\theta\). If \(F_{\hat\theta}\) is tightly concentrated around \(\theta\) then \(\hat\theta\) is highly accurate. Conversely, if \(F_{\hat\theta}\) is highly dispersed, or not centered around \(\theta\), then \(\hat\theta\) is an inaccurate estimator. Second, we may want to compare two possible estimators. I.e., if there are two potential estimators \(\hat\theta_1\) and \(\hat\theta_2\), we can compare \(F_{\hat\theta_1}\) and \(F_{\hat\theta_2}\) and use the estimator whose sampling distribution is most tightly concentrated around \(\theta\).

To illustrate, let's suppose we sample \(y_1, \ldots, y_n\) from distribution \(F_Y\), and want to estimate \(\theta \equiv E[Y]\). We consider two potential estimators, the sample mean \(\hat\theta_1 = (1/n)\sum y_i\) and the sample median \(\hat\theta_2\). To see which estimator is better we do a simulation, as shown in the following snippet. The simulation is done at four different sample sizes, \(n = 4, 16, 64, 256\), to see whether sample size matters. Here we'll let \(F_Y\) be \(N(0, 1)\). But the choice between \(\hat\theta_1\) and \(\hat\theta_2\) might depend on what \(F_Y\) is, so a more thorough investigation would consider other choices of \(F_Y\).

We do 1000 simulations at each sample size. Figure 2.27 shows the result. The figure suggests that the sampling distributions of both \(\hat\theta_1\) and \(\hat\theta_2\) are centered at the true value of \(\theta\). The distribution of \(\hat\theta_1\) is slightly less variable than that of \(\hat\theta_2\), but not enough to make much practical difference.

Figure 2.27 was produced by the following snippet.

    sampsize <- c(4, 16, 64, 256)
    n.sim <- 1000
    par(mfrow=c(2,2))
    for (i in seq(along=sampsize)) {
      # each column of y is one simulated sample of size sampsize[i]
      y <- matrix(rnorm(n.sim*sampsize[i], 0, 1),
                  nrow=sampsize[i], ncol=n.sim)
      that.1 <- apply(y, 2, mean)
      that.2 <- apply(y, 2, median)
      boxplot(that.1, that.2, names=c("mean","median"),
              main=paste("(", letters[i], ")", sep=""))
      abline(h=0, lty=2)
    }

For us, comparing \(\hat\theta_1\) to \(\hat\theta_2\) is only a secondary point of the simulation. The main point is four-fold.

1. An estimator \(\hat\theta\) is a random variable and has a distribution.

2. \(F_{\hat\theta}\) is a guide to estimation accuracy.

3. Statisticians study conditions under which one estimator is better than another.


Figure 2.27: Sampling distribution of \(\hat\theta_1\), the sample mean, and \(\hat\theta_2\), the sample median. Four different sample sizes. (a): n=4; (b): n=16; (c): n=64; (d): n=256


4. Simulation is useful.

When the m.l.e. is the sample mean, as it is when \(F_Y\) is a Bernoulli, Normal, Poisson or Exponential distribution, the Central Limit Theorem tells us that in large samples, \(\hat\theta\) is approximately Normally distributed. Therefore, in these cases, its distribution can be well described by its mean and SD. Approximately,

\[
\hat\theta \sim N(\mu_{\hat\theta}, \sigma_{\hat\theta})
\]

where

\[
\mu_{\hat\theta} = \mu_Y, \qquad \sigma_{\hat\theta} = \frac{\sigma_Y}{\sqrt{n}}
\tag{2.8}
\]

both of which can be easily estimated from the sample. So we can use the sample to compute a good approximation to the sampling distribution of the m.l.e.

To see that more clearly, let's make 1000 simulations of the m.l.e. in \(n = 5, 10, 25, 100\) Bernoulli trials with \(p = .1\). We'll make histograms of those simulations and overlay them with kernel density estimates and Normal densities. The parameters of the Normal densities will be estimated from the simulations. Results are shown in Figure 2.28.

Figure 2.28 was produced by the following snippet.

    sampsize <- c(5, 10, 25, 100)
    n.sim <- 1000
    p.true <- .1
    par(mfrow=c(2,2))
    for (i in seq(along=sampsize)) {
      # n.sim Bernoulli samples of sampsize[i]
      y <- matrix(rbinom(n.sim*sampsize[i], 1, p.true),
                  nrow=n.sim, ncol=sampsize[i])
      # for each sample, compute the mean
      t.hat <- apply(y, 1, mean)
      # histogram of theta hat
      hist(t.hat, prob=TRUE,
           xlim=c(0,.6), xlab=expression(hat(theta)),
           ylim=c(0,14), ylab="density",
           main=paste("(", letters[i], ")", sep=""))
      # kernel density estimate of theta hat
      lines(density(t.hat), lty=2)
      # Normal approximation to density of theta hat,
      # calculated from the first sample
      m <- mean(y[1,])
      sd <- sd(y[1,]) / sqrt(sampsize[i])
      t <- seq(min(t.hat), max(t.hat), length=40)
      lines(t, dnorm(t, m, sd), lty=3)
    }

Figure 2.28: Histograms of \(\hat\theta\), the sample mean, for samples from \(\mathrm{Bin}(n, .1)\). Dashed line: kernel density estimate. Dotted line: Normal approximation. (a): n=5; (b): n=10; (c): n=25; (d): n=100

Notice that the Normal approximation is not very good for small \(n\). That's because the underlying distribution \(F_Y\) is highly skewed, nothing at all like a Normal distribution. In fact, R was unable to compute the Normal approximation for \(n = 5\) (presumably because the first sample contained no successes, so its estimated SD was zero). But for large \(n\), the Normal approximation is quite good. That's the Central Limit Theorem kicking in. For any \(n\), we can use the sample to estimate the parameters in Equation 2.8. For small \(n\), those parameters don't help us much. But for \(n = 100\) they tell us a lot about the accuracy of \(\hat\theta\), and the Normal approximation computed from the first sample is a good match to the sampling distribution of \(\hat\theta\).

The SD of an estimator is given a special name. It's called the standard error or SE of the estimator because it measures the typical size of estimation errors \(|\hat\theta - \theta|\). When \(\hat\theta \sim N(\mu_{\hat\theta}, \sigma_{\hat\theta})\), approximately, then \(\sigma_{\hat\theta}\) is the SE. For any Normal distribution, about 95% of the mass is within \(\pm 2\) standard deviations of the mean. Therefore,

\[
\Pr[\,|\hat\theta - \theta| \le 2\sigma_{\hat\theta}\,] \approx .95
\]

In other words, estimates are accurate to within about two standard errors about 95% of the time, at least when Normal theory applies.

We have now seen two ways of assessing estimation accuracy: through \(\ell(\theta)\) and through \(F_{\hat\theta}\). Often these two apparently different approaches almost coincide. That happens under the following conditions.

1. When \(\hat\theta \sim N(\theta, \sigma_{\hat\theta})\), and \(\sigma_{\hat\theta} \approx \sigma/\sqrt{n}\), an approximation often justified by the Central Limit Theorem, then we can estimate \(\theta\) to within about \(\pm 2\sigma_{\hat\theta}\), around 95% of the time. So the interval \((\hat\theta - 2\sigma_{\hat\theta},\ \hat\theta + 2\sigma_{\hat\theta})\) is a reasonable estimation interval.

2. When most of the information in the data comes from the sample mean, and in other cases when a marginal likelihood argument applies, then \(\ell(\theta) \approx \exp\left\{ -\frac{1}{2}\left( \frac{\theta - \bar{Y}}{\hat\sigma/\sqrt{n}} \right)^{2} \right\}\) (Equation 2.4) and \(\mathrm{LS}_{.1} \approx (\hat\theta - 2\sigma_{\hat\theta},\ \hat\theta + 2\sigma_{\hat\theta})\). So the two intervals are about the same.

2.5 Bayesian Inference

The essence of Bayesian inference is using probability distributions to describe our state of knowledge of some parameter of interest, \(\theta\). We construct \(p(\theta)\), either a pmf or pdf, to reflect our knowledge by making \(p(\theta)\) large for those values of \(\theta\) that seem most likely, and \(p(\theta)\) small for those values of \(\theta\) that seem least likely, according to our state of knowledge. Although \(p(\theta)\) is a probability distribution, it doesn't necessarily mean that \(\theta\) is a random variable. Rather, \(p(\theta)\) encodes our state of knowledge. And different people can have different states of knowledge, hence different probability distributions. For example, suppose you toss a fair coin, look at it, but don't show it to me. The outcome is not random; it has already occurred and you know what it is. But for me, each outcome is equally likely. I would encode my state of knowledge by assigning \(P(H) = P(T) = 1/2\). You would encode your state of knowledge by assigning either \(P(H) = 1\) or \(P(T) = 1\) according to whether the coin was Heads or Tails. After I see the coin I would update my probabilities to be the same as yours.

For another common example, consider horse racing. When a bettor places a bet at 10 to 1, she is paying \$1 for a ticket that will pay \$10 if the horse wins. Her expected payoff for that bet is \(-\$1 + P[\text{horse wins}] \times \$10\). For that to be a good deal she must think that \(P[\text{horse wins}] \ge .1\). Of course other bettors may disagree.

Here are some other examples in which probability distributions must be assessed.

• In deciding whether to fund Head Start, legislators must assess whether the program is likely to be beneficial and, if so, the degree of benefit.

• When investing in the stock market, investors must assess the future probability distributions of stocks they may buy.

• When making business decisions, firms must assess the future probability distributions of outcomes.


• Weather forecasters assess the probability of rain.

• Public policy makers must assess whether the observed increase in average global temperature is anthropogenic and, if so, to what extent.

• Doctors and patients must assess and compare the distribution of outcomes under several alternative treatments.

• At the Slater School, Example 2.8, teachers and administrators must assess their probability distribution for \(\theta\), the chance that a randomly selected teacher develops invasive cancer.

Information of many types goes into assessing probability distributions. But it is often useful to divide the information into two types: general background knowledge and information specific to the situation at hand. How do those two types of information combine to form an overall distribution for \(\theta\)? Often we begin by summarizing just the background information as \(p(\theta)\), the marginal distribution of \(\theta\). The specific information at hand is data which we can model as \(p(y_1, \ldots, y_n \mid \theta)\), the conditional distribution of \(y_1, \ldots, y_n\) given \(\theta\). Next, the marginal and conditional densities are combined to give the joint distribution \(p(y_1, \ldots, y_n, \theta)\). Finally, the joint distribution yields \(p(\theta \mid y_1, \ldots, y_n)\), the conditional distribution of \(\theta\) given \(y_1, \ldots, y_n\). And \(p(\theta \mid y_1, \ldots, y_n)\) represents our state of knowledge accounting for both the background information and the data specific to the problem at hand. \(p(\theta)\) is called the prior distribution and \(p(\theta \mid y_1, \ldots, y_n)\) is the posterior distribution.

A common application is in medical screening exams. Consider a patient being screened for a rare disease, one that affects 1 in 1000 people, say. The disease rate in the population is background information; the patient's response on the screening exam is data specific to this particular patient. Define an indicator variable \(D\) by \(D = 1\) if the patient has the disease and \(D = 0\) if not. Define a second random variable \(T\) by \(T = 1\) if the test result is positive and \(T = 0\) if the test result is negative. And suppose the test is 95% accurate in the sense that \(P[T = 1 \mid D = 1] = P[T = 0 \mid D = 0] = .95\). Finally, what is the chance that the patient has the disease given that the test is positive? In other words, what is \(P[D = 1 \mid T = 1]\)?
We have the marginal distribution of \(D\) and the conditional distribution of \(T\) given \(D\). The procedure is to find the joint distribution of \((D, T)\), then the conditional distribution of \(D\) given \(T\). The math is

\[
\begin{aligned}
P[D = 1 \mid T = 1] &= \frac{P[D = 1 \text{ and } T = 1]}{P[T = 1]} \\
&= \frac{P[D = 1 \text{ and } T = 1]}{P[T = 1 \text{ and } D = 1] + P[T = 1 \text{ and } D = 0]} \\
&= \frac{P[D = 1]\,P[T = 1 \mid D = 1]}{P[D = 1]\,P[T = 1 \mid D = 1] + P[D = 0]\,P[T = 1 \mid D = 0]} \\
&= \frac{(.001)(.95)}{(.001)(.95) + (.999)(.05)} \\
&= \frac{.00095}{.00095 + .04995} \approx .019.
\end{aligned}
\tag{2.9}
\]

That is, a patient who tests positive has only about a 2% chance of having the disease, even though the test is 95% accurate.

Many people find this a surprising result and suspect a mathematical trick. But a quick heuristic check says that out of 1000 people we expect 1 to have the disease, and that person to test positive; we expect 999 people not to have the disease and 5% of those, or about 50, to test positive; so among the 51 people who test positive, only 1, or a little less than 2%, has the disease. The math is correct. This is an example where most people's intuition is at fault and careful attention to mathematics is required in order not to be led astray.

What is the likelihood function in this example? There are two possible values of the parameter, hence only two points in the domain of the likelihood function, \(D = 0\) and \(D = 1\). So the likelihood function is

\[
\ell(0) = .05; \qquad \ell(1) = .95
\]

Here's another way to look at the medical screening problem, one that highlights the multiplicative nature of likelihood.

\[
\begin{aligned}
\frac{P[D = 1 \mid T = 1]}{P[D = 0 \mid T = 1]} &= \frac{P[D = 1 \text{ and } T = 1]}{P[D = 0 \text{ and } T = 1]} \\
&= \frac{P[D = 1]\,P[T = 1 \mid D = 1]}{P[D = 0]\,P[T = 1 \mid D = 0]} \\
&= \left( \frac{P[D = 1]}{P[D = 0]} \right) \left( \frac{P[T = 1 \mid D = 1]}{P[T = 1 \mid D = 0]} \right) \\
&= \left( \frac{1}{999} \right) \left( \frac{.95}{.05} \right) \\
&\approx .019
\end{aligned}
\]


The LHS of this equation is the posterior odds of having the disease. The penultimate line shows that the posterior odds is the product of the prior odds and the likelihood ratio. Specifically, to calculate the posterior, we need only the likelihood ratio, not the absolute value of the likelihood function. And likelihood ratios are the means by which prior odds get transformed into posterior odds.

Let's look more carefully at the mathematics in the case where the distributions have densities. Let \(y\) denote the data, even though in practice it might be \(y_1, \ldots, y_n\).

\[
p(\theta \mid y) = \frac{p(\theta, y)}{p(y)} = \frac{p(\theta, y)}{\int p(\theta, y)\, d\theta} = \frac{p(\theta)\,p(y \mid \theta)}{\int p(\theta)\,p(y \mid \theta)\, d\theta}
\tag{2.10}
\]

Equation 2.10 is the same as Equation 2.9, only in more general terms. Since we are treating the data as given and \(p(\theta \mid y)\) as a function of \(\theta\), we are justified in writing

\[
p(\theta \mid y) = \frac{p(\theta)\,\ell(\theta)}{\int p(\theta)\,\ell(\theta)\, d\theta}
\]

or

\[
p(\theta \mid y) = \frac{p(\theta)\,\ell(\theta)}{c}
\]

where \(c = \int p(\theta)\,\ell(\theta)\, d\theta\) is a constant that does not depend on \(\theta\). (An integral with respect to \(\theta\) does not depend on \(\theta\); after integration it does not contain \(\theta\).) The effect of the constant \(c\) is to rescale the function in the numerator so that it integrates to 1. I.e., \(\int p(\theta \mid y)\, d\theta = 1\). And since \(c\) plays this role, the likelihood function can absorb an arbitrary constant which will ultimately be compensated for by \(c\). One often sees the expression

\[
p(\theta \mid y) \propto p(\theta)\,\ell(\theta)
\tag{2.11}
\]

where the unmentioned constant of proportionality is \(1/c\).

We can find \(c\) either through Equation 2.10 or by using Equation 2.11, recognizing the form of the density, and reading off the constant that makes it integrate to 1. Example 2.15 illustrates the second approach.

Example 2.15 (Seedlings, continued)

Recall the Seedlings examples (1.4, 1.6, 1.7, 1.9, 2.7, and 2.13) which modelled the number of New seedling arrivals as \(\mathrm{Poi}(\lambda)\). Prior to the experiment ecologists knew quite a bit about regeneration rates of acer rubrum in the vicinity of the experimental quadrats. They estimated that New seedlings would arise at a rate most likely around .5 to 2 seedlings per quadrat per year and less likely either more or less than that. Their knowledge could be encoded in the prior density displayed in Figure 2.29, which is \(p(\lambda) = 4\lambda^{2}e^{-2\lambda}\). (This is the \(\mathrm{Gam}(3, 1/2)\) density; see Section 5.5.) Figure 2.29 also displays the likelihood function \(p(y \mid \lambda) \propto \lambda^{3}e^{-\lambda}\) found in Example 1.4 and Figure 1.6. Therefore, according to Equation 2.11, the posterior density is \(p(\lambda \mid y) \propto \lambda^{5}e^{-3\lambda}\). In Section 5.5 we will see that this is the \(\mathrm{Gam}(6, 1/3)\) density, up to a constant of proportionality. Therefore the constant of proportionality in this example must be the constant that appears in the Gamma density: \(1/[5!/3^{6}] = 3^{6}/5!\).

In Figure 2.29 the posterior density is more similar to the prior density than to the likelihood function. But the analysis deals with only a single data point. Let's see what happens as data accumulates. If we have observations \(y_1, \ldots, y_n\), the likelihood function becomes

\[
\ell(\lambda) = \prod p(y_i \mid \lambda) = \prod \frac{e^{-\lambda}\lambda^{y_i}}{y_i!} \propto e^{-n\lambda}\lambda^{\sum y_i}
\]

To see what this means in practical terms, Figure 2.30 shows (a): the same prior we used in Example 2.15, (b): \(\ell(\lambda)\) for \(n = 1, 4, 16\), and (c): the posterior for \(n = 1, 4, 16\), always with \(\bar{y} = 3\).

1. As \(n\) increases the likelihood function becomes increasingly peaked. That's because as \(n\) increases, the amount of information about \(\lambda\) increases, and we know \(\lambda\) with increasing accuracy. The likelihood function becomes increasingly peaked around the true value of \(\lambda\) and interval estimates become increasingly narrow.

2. As \(n\) increases the posterior density becomes increasingly peaked and becomes increasingly like \(\ell(\lambda)\). That's because as \(n\) increases, the amount of information in the data increases and the likelihood function becomes increasingly peaked. Meanwhile, the prior density remains as it was. Eventually the data contains much more information than the prior, so the likelihood function becomes much more peaked than the prior and the likelihood dominates. So the posterior, the product of prior and likelihood, looks increasingly like the likelihood.

Another way to look at it is through the log of the posterior density, \(\log p(\lambda \mid y_1, \ldots, y_n) = c' + \log p(\lambda) + \sum_{1}^{n} \log p(y_i \mid \lambda)\), where \(c'\) is a constant that does not depend on \(\lambda\). As \(n \to \infty\) there is an increasing number of terms in the sum, so the sum eventually becomes much larger and much more important than \(\log p(\lambda)\).
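The behavior described in points 1 and 2 can be tabulated directly, because the prior \(4\lambda^{2}e^{-2\lambda}\) times the likelihood \(e^{-n\lambda}\lambda^{\sum y_i}\) is again of Gamma form. The sketch below is an illustration under two stated assumptions: that \(\mathrm{Gam}(a, b)\) in the text means shape \(a\) and scale \(b\) (which matches the prior formula), so that R's `dgamma` is called with `rate = 1/b`; and that \(\bar{y} = 3\) at every \(n\), as in Figure 2.30. The names `post.mean`, `post.sd` and `unnorm` are introduced here for illustration.

```r
# Posterior is proportional to lambda^(2 + n*ybar) * exp(-(2 + n) * lambda),
# i.e. a Gamma density with shape 3 + n*ybar and rate 2 + n.
n     <- c(1, 4, 16)
ybar  <- 3
shape <- 3 + n * ybar      # prior shape 3 plus the total count
rate  <- 2 + n             # prior rate 2 plus the number of observations

post.mean <- shape / rate
post.sd   <- sqrt(shape) / rate
print(data.frame(n, shape, rate, post.mean, post.sd))

# Check the n = 1 case (y = 3) against direct normalization of
# prior * likelihood, in the spirit of Equation 2.10.
unnorm <- function(lam) 4 * lam^2 * exp(-2 * lam) * lam^3 * exp(-lam)
const  <- integrate(unnorm, 0, Inf)$value
lam0   <- 1.7              # arbitrary evaluation point
print(c(direct = unnorm(lam0) / const,
        dgamma = dgamma(lam0, shape = 6, rate = 3)))
```

As \(n\) grows, the posterior standard deviation shrinks and the posterior mean moves from near the prior toward \(\bar{y}\), which is the pattern shown in panel (c) of Figure 2.30.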


Figure 2.29: Prior, likelihood and posterior densities for \(\lambda\) in the seedlings example after the single observation \(y = 3\)


In practice, of course, \(\bar{y}\) usually doesn't remain constant as \(n\) increases. We saw in Example 1.6 that there were 40 new seedlings in 60 quadrats. With this data the posterior density is

\[
p(\lambda \mid y_1, \ldots, y_{60}) \propto \lambda^{42}e^{-62\lambda}
\tag{2.12}
\]

which is the \(\mathrm{Gam}(43, 1/62)\) density. It is pictured in Figure 2.31. Compare to Figure 2.29.

Example 2.16 shows Bayesian statistics at work for the Slater School. See Lavine [1999] for further analysis.

Example 2.16 (Slater School, cont.)

At the time of the analysis reported in Brodeur [1992] there were two other lines of evidence regarding the effect of power lines on cancer. First, there were some epidemiological studies showing that people who live near power lines or who work as power line repairmen develop cancer at higher rates than the population at large, though only slightly higher. And second, chemists and physicists who calculate the size of magnetic fields induced by power lines (the supposed mechanism for inducing cancer) said that the small amount of energy in the magnetic fields is insufficient to have any appreciable effect on the large biological molecules that are involved in cancer genesis. These two lines of evidence are contradictory. How shall we assess a distribution for \(\theta\), the probability that a teacher hired at Slater School develops cancer?

Recall from page 137 that Neutra, the state epidemiologist, calculated 4.2 cases of cancer could have been expected to occur if the cancer rate at Slater were equal to the national average. Therefore, the national average cancer rate for women of the age typical of Slater teachers is \(4.2/145 \approx .03\). Considering the view of the physicists, our prior distribution should have a fair bit of mass on values of \(\theta \approx .03\). And considering the epidemiological studies and the likelihood that effects would have been detected before 1992 if they were strong, our prior distribution should put most of its mass below \(\theta \approx .06\). For the sake of argument let's adopt the prior depicted in Figure 2.32. Its formula is

\[
p(\theta) = \frac{\Gamma(420)}{\Gamma(20)\,\Gamma(400)}\,\theta^{19}(1-\theta)^{399}
\tag{2.13}
\]

which we will see in Section 5.6 is the \(\mathrm{Be}(20, 400)\) density. The likelihood function is \(\ell(\theta) \propto \theta^{8}(1-\theta)^{137}\) (Equation 2.3, Figure 2.20). Therefore the posterior density \(p(\theta \mid y) \propto \theta^{27}(1-\theta)^{536}\), which we will see in Section 5.6 is the \(\mathrm{Be}(28, 537)\) density. Therefore we can easily write down the constant and get the posterior density

\[
p(\theta \mid y) = \frac{\Gamma(565)}{\Gamma(28)\,\Gamma(537)}\,\theta^{27}(1-\theta)^{536}
\]

which is also pictured in Figure 2.32.
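The prior-to-posterior update in Example 2.16 amounts to adding the number of cancers to the first Beta parameter and the number of non-cancers to the second. The following sketch illustrates that bookkeeping and checks it against direct numerical normalization of prior times likelihood; it is not taken from the book. The variable names and the evaluation point `th0` are assumptions introduced here, and the summaries printed (posterior mean, and posterior probability that \(\theta\) exceeds the national rate of about .03) are examples of what one might compute, not results quoted from the text.

```r
# Prior Be(20, 400); data: y = 8 cancers among n = 145 teachers
a.prior <- 20
b.prior <- 400
y <- 8
n <- 145

# Conjugate update: add successes and failures to the prior parameters
a.post <- a.prior + y
b.post <- b.prior + (n - y)
print(c(a.post = a.post, b.post = b.post))

# Two illustrative posterior summaries
print(a.post / (a.post + b.post))          # posterior mean of theta
print(1 - pbeta(0.03, a.post, b.post))     # posterior P[theta > .03]

# Check against direct normalization of prior * likelihood
unnorm <- function(th) dbeta(th, a.prior, b.prior) * dbinom(y, n, th)
const  <- integrate(unnorm, 0, 1)$value
th0    <- 0.05                              # arbitrary evaluation point
print(c(direct = unnorm(th0) / const,
        dbeta  = dbeta(th0, a.post, b.post)))
```

The binomial coefficient inside `dbinom` is a constant in \(\theta\), so it cancels in the ratio `unnorm(th0) / const`, illustrating the remark that the likelihood can absorb an arbitrary constant.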


Figure 2.30: (a): Prior, (b): likelihood and (c): posterior densities for \(\lambda\) with \(n = 1, 4, 16\)


Figure 2.31: Prior, likelihood and posterior densities for \(\lambda\) with \(n = 60\), \(\sum y_i = 40\).


Figure 2.32: Prior, likelihood and posterior density for Slater School


Examples 2.15 and 2.16 have the convenient feature that the prior density had the same form (\(\lambda^{a}e^{-b\lambda}\) in one case and \(\theta^{a}(1-\theta)^{b}\) in the other) as the likelihood function, which made the posterior density and the constant \(c\) particularly easy to calculate. This was not a coincidence. The investigators knew the form of the likelihood function and looked for a convenient prior of the same form that approximately represented their prior beliefs. This convenience, and whether choosing a prior density for this property is legitimate, are topics which deserve serious thought but which we shall not take up at this point.

2.6 Prediction

Sometimes the goal of statistical analysis is to make predictions for future observations. Let \(y_1, \ldots, y_n, y_f\) be a sample from \(p(\cdot \mid \theta)\). We observe \(y_1, \ldots, y_n\) but not \(y_f\), and want a prediction for \(y_f\). There are three common forms that predictions take.

point predictions: A point prediction is a single guess for \(y_f\). It might be a predictive mean, predictive median, predictive mode, or any other type of point prediction that seems sensible.

interval predictions: An interval prediction, or predictive interval, is an interval of plausible values for \(y_f\). A predictive interval is accompanied by a probability. For example, we might say that "The interval \((0, 5)\) is a 90% predictive interval for \(y_f\)" which would mean \(\Pr[y_f \in (0, 5)] = .90\). In a given problem there are, for two reasons, many predictive intervals. First, there are 90% intervals, 95% intervals, 50% intervals, and so on. And second, there are many predictive intervals with the same probability. For instance, if \((0, 5)\) is a 90% predictive interval, then it's possible that \((-1, 4.5)\) is also a 90% predictive interval.

predictive distributions: A predictive distribution is a probability distribution for \(y_f\). From a predictive distribution, different people could compute point predictions or interval predictions, each according to their needs.

In the real world, we don't know \(\theta\). After all, that's why we collected data \(y_1, \ldots, y_n\). But for now, to clarify the types of predictions listed above, let's pretend that we do know \(\theta\). Specifically, let's pretend that we know \(y_1, \ldots, y_n, y_f \sim \text{i.i.d. } N(-2, 1)\).

The main thing to note, since we know \(\theta\) (in this case, the mean and SD of the Normal distribution), is that \(y_1, \ldots, y_n\) don't help us at all. That is, they contain no information about \(y_f\) that is not already contained in the knowledge of \(\theta\). In other words, \(y_1, \ldots, y_n\) and \(y_f\) are conditionally independent given \(\theta\). In symbols:

\[
p(y_f \mid \theta, y_1, \ldots, y_n) = p(y_f \mid \theta).
\]

Therefore, our prediction should be based on the knowledge of \(\theta\) alone, not on any aspect of \(y_1, \ldots, y_n\).

A sensible point prediction for \(y_f\) is \(\hat{y}_f = -2\), because \(-2\) is the mean, median, and mode of the \(N(-2, 1)\) distribution. Some sensible 90% prediction intervals are \((-\infty, -0.72)\), \((-3.65, -0.36)\) and \((-3.28, \infty)\). We would choose one or the other depending on whether we wanted to describe the lowest values that \(y_f\) might take, a middle set of values, or the highest values. And, of course, the predictive distribution of \(y_f\) is \(N(-2, 1)\). It completely describes the extent of our knowledge and ability to predict \(y_f\).

In real problems, though, we don't know \(\theta\). The simplest way to make a prediction consists of two steps. First use \(y_1, \ldots, y_n\) to estimate \(\theta\), then make predictions based on \(p(y_f \mid \hat\theta)\). Predictions made by this method are called plug-in predictions. In the example of the previous paragraph, if \(y_1, \ldots, y_n\) yielded \(\hat\mu = -2\) and \(\hat\sigma = 1\) then predictions would be exactly as described above.

For an example with discrete data, refer to Examples 1.4 and 1.6 in which \(\lambda\) is the arrival rate of new seedlings. We found \(\hat\lambda = 2/3\). The entire plug-in predictive distribution is displayed in Figure 2.33. \(\hat{y}_f = 0\) is a sensible point prediction. The set \(\{0, 1, 2\}\) is a 97% plug-in prediction interval or prediction set (because ppois(2, 2/3) \(\approx .97\)); the set \(\{0, 1, 2, 3\}\) is a 99.5% interval.
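The prediction intervals and plug-in prediction sets quoted above can be reproduced with R's quantile and distribution functions. The sketch below is an illustration, not the book's code; the variable names are introduced here, and the three Normal intervals are built on the assumption that "lowest", "middle" and "highest" mean the lower 90%, central 90% and upper 90% of the \(N(-2, 1)\) distribution.

```r
# Known-parameter Normal case: three different 90% prediction intervals
lower.int  <- c(-Inf, qnorm(0.90, mean = -2, sd = 1))   # lowest values
middle.int <- qnorm(c(0.05, 0.95), mean = -2, sd = 1)   # central values
upper.int  <- c(qnorm(0.10, mean = -2, sd = 1), Inf)    # highest values
print(rbind(lower.int, middle.int, upper.int))

# Poisson plug-in case with lambda.hat = 2/3
lambda.hat <- 2/3
print(dpois(0:3, lambda.hat))   # plug-in predictive probabilities for 0..3
print(ppois(2, lambda.hat))     # coverage of the prediction set {0, 1, 2}
print(ppois(3, lambda.hat))     # coverage of the prediction set {0, 1, 2, 3}
```

Because `dpois(0, lambda.hat)` is the largest of the plug-in probabilities, 0 is the predictive mode, which is why it serves as a sensible point prediction.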
There are two sources of uncertainty in making predictions. First, because \(y_f\) is random, we couldn't predict it perfectly even if we knew \(\theta\). And second, we don't know \(\theta\). In any given problem, either one of the two might be the more important source of uncertainty. The first type of uncertainty can't be eliminated. But in theory, the second type can be reduced by collecting an increasingly large sample \(y_1, \ldots, y_n\) so that we know \(\theta\) with ever more accuracy. Eventually, when we know \(\theta\) accurately enough, the second type of uncertainty becomes negligible compared to the first. In that situation, plug-in predictions do capture almost the full extent of predictive uncertainty.

But in many practical problems the second type of uncertainty is too large to be ignored. Plug-in predictive intervals and predictive distributions are too optimistic because they don't account for the uncertainty involved in estimating \(\theta\). A Bayesian approach to prediction can account for this uncertainty. The prior distribution of \(\theta\) and the conditional distribution of \(y_1, \ldots, y_n, y_f\) given \(\theta\) provide the full joint distribution of \(y_1, \ldots, y_n, y_f, \theta\), which in turn provides the conditional distribution of \(y_f\) given \(y_1, \ldots, y_n\). Specifically,

\[
\begin{aligned}
p(y_f \mid y_1, \ldots, y_n) &= \int p(y_f, \theta \mid y_1, \ldots, y_n)\, d\theta \\
&= \int p(\theta \mid y_1, \ldots, y_n)\, p(y_f \mid \theta, y_1, \ldots, y_n)\, d\theta \\
&= \int p(\theta \mid y_1, \ldots, y_n)\, p(y_f \mid \theta)\, d\theta
\end{aligned}
\tag{2.14}
\]

Equation 2.14 is just the \(y_f\) marginal density derived from the joint density of \((\theta, y_f)\), all densities being conditional on the data observed so far. To say it another way, the predictive density \(p(y_f)\) is \(\int p(\theta, y_f)\, d\theta = \int p(\theta)\,p(y_f \mid \theta)\, d\theta\), but where \(p(\theta)\) is really the posterior \(p(\theta \mid y_1, \ldots, y_n)\). The role of \(y_1, \ldots, y_n\) is to give us the posterior density of \(\theta\) instead of the prior.

Figure 2.33: Plug-in predictive distribution \(y_f \sim \mathrm{Poi}(\lambda = 2/3)\) for the seedlings example

The predictive distribution in Equation 2.14 will be somewhat more dispersed than the plug-in predictive distribution. If we don't know much about \(\theta\) then the posterior will be widely dispersed and Equation 2.14 will be much more dispersed than the plug-in predictive distribution. On the other hand, if we know a lot about \(\theta\) then the posterior distribution will be tight and Equation 2.14 will be only slightly more dispersed than the plug-in predictive distribution.

Example 2.17 (Seedlings, cont.)

Refer to Examples 1.4 and 2.15 about \(Y\), the number of new seedlings emerging each year in a forest quadrat. Our model is \(Y \sim \mathrm{Poi}(\lambda)\). The prior (page 167) was \(p(\lambda) = 4\lambda^{2}e^{-2\lambda}\). Before collecting any data our predictive distribution would be based on that prior. For any number \(y\) we could calculate

\[
\begin{aligned}
p_{Y_f}(y) \equiv P[Y_f = y] &= \int p_{Y_f \mid \Lambda}(y \mid \lambda)\, p_{\Lambda}(\lambda)\, d\lambda \\
&= \int \frac{\lambda^{y}e^{-\lambda}}{y!}\, \frac{2^{3}}{\Gamma(3)}\, \lambda^{2}e^{-2\lambda}\, d\lambda \\
&= \frac{2^{3}}{y!\,\Gamma(3)} \int \lambda^{y+2}e^{-3\lambda}\, d\lambda \\
&= \frac{2^{3}\,\Gamma(y+3)}{y!\,\Gamma(3)\,3^{y+3}} \int \frac{3^{y+3}}{\Gamma(y+3)}\, \lambda^{y+2}e^{-3\lambda}\, d\lambda \\
&= \binom{y+2}{y} \left( \frac{2}{3} \right)^{3} \left( \frac{1}{3} \right)^{y},
\end{aligned}
\tag{2.15}
\]

(We will see in Chapter 5 that this is a Negative Binomial distribution.) Thus for example, according to our prior,

\[
\begin{aligned}
\Pr[Y_f = 0] &= \left( \frac{2}{3} \right)^{3} = \frac{8}{27} \\
\Pr[Y_f = 1] &= 3 \left( \frac{2}{3} \right)^{3} \frac{1}{3} = \frac{8}{27}
\end{aligned}
\]

etc. Figure 2.34 displays these probabilities.

In the first quadrat we found \(y_1 = 3\) and the posterior distribution (Example 2.15, pg. 167)

\[
p(\lambda \mid y_1 = 3) = \frac{3^{6}}{5!}\, \lambda^{5}e^{-3\lambda}.
\]

So, by calculations similar to Equation 2.15, the predictive distribution after observing \(y_1 = 3\) is

\[
p_{Y_f \mid Y_1}(y \mid y_1 = 3) = \int p_{Y_f \mid \Lambda}(y \mid \lambda)\, p_{\Lambda \mid Y_1}(\lambda \mid y_1 = 3)\, d\lambda
= \binom{y+5}{y} \left( \frac{3}{4} \right)^{6} \left( \frac{1}{4} \right)^{y}
\tag{2.16}
\]


2.7.HYPOTHESISTESTING 178 So,forexample, Pr[ Y f =0 j y 1 =3]= 3 4 6 Pr[ Y f =1 j y 1 =3]=6 3 4 6 1 4 etc. Figure2.34displaystheseprobabilities. Finally,whenwecollecteddatafrom60quadrats,wefound p j y 1 ;:::;y 60 = 62 43 42! 42 e )]TJ/F39 7.9701 Tf 6.587 0 Td [(62 .17 Therefore,bycalculationssimilartoEquation2.15,thepredictivedistributionis Pr[ Y f = y j y 1 ;:::;y 60 ]= y +42 y 62 63 6 1 63 y .18 Figure2.34displaystheseprobabilities. Apriori,andafteronly n =1 observation, isnotknowveryprecisely;bothtypesof uncertaintyareimportant;andtheBayesianpredictivedistributionisnoticablydierent fromtheplug-inpredictivedistribution.Butafter n =60 observations isknownfairly well;thesecondtypeofuncertaintyisnegligible;andtheBayesianpredictivedistribution isverysimilartotheplug-inpredictivedistribution. 2.7HypothesisTesting Scienticinquiryoftentakestheformofhypothesistesting.Ineachinstancethere aretwohypothesesthe null hypothesisH 0 andthe alternative hypothesisH a medicine H 0 :thenewdrugandtheolddrugareequallyeffective. H a :thenewdrugisbetterthantheold. publichealth H 0 exposuretohighvoltageelectriclinesisbenign.


Figure 2.34: Predictive distributions of $y_f$ in the seedlings example after samples of size $n = 0, 1, 60$, and the plug-in predictive
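The probabilities plotted in a figure like Figure 2.34 can be tabulated with R's `dnbinom` and `dpois`. This is a sketch under two assumptions that are not stated in the text: that the plug-in predictive is the Poisson distribution with $\lambda$ fixed at its m.l.e. after 60 quadrats, and that the 60 quadrats contained 40 seedlings in total (inferred from the exponent 42 in Equation 2.17, since the prior contributes $\lambda^2$). The variable names are illustrative.

```r
# Predictive probabilities Pr[Y_f = y], y = 0..5, in the seedlings example.
# A Gamma(shape = a, rate = b) distribution for lambda gives a Negative
# Binomial predictive with size = a and prob = b / (b + 1).
y <- 0:5
pred.n0  <- dnbinom(y, size = 3,  prob = 2 / 3)    # prior only
pred.n1  <- dnbinom(y, size = 6,  prob = 3 / 4)    # after y1 = 3
pred.n60 <- dnbinom(y, size = 43, prob = 62 / 63)  # after 60 quadrats

# Plug-in predictive: Poisson with lambda set to its estimate.
# ASSUMPTION: total count 40 over 60 quadrats, inferred from Equation 2.17.
lambda.hat <- 40 / 60
plug.in <- dpois(y, lambda.hat)

print(round(cbind(y, pred.n0, pred.n1, pred.n60, plug.in), 4))
```

In the printed table the `pred.n60` and `plug.in` columns should be close to each other, while `pred.n0` and `pred.n1` put noticeably more probability on larger counts.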


    $H_a$: exposure to high voltage electric lines promotes cancer.
public policy
    $H_0$: Head Start has no effect.
    $H_a$: Head Start is beneficial.
astronomy
    $H_0$: The sun revolves around the Earth.
    $H_a$: The Earth revolves around the sun.
physics
    $H_0$: Newtonian mechanics holds.
    $H_a$: Relativity holds.
public trust
    $H_0$: Winning lottery numbers are random.
    $H_a$: Winning lottery numbers have patterns.
ESP
    $H_0$: There is no ESP.
    $H_a$: There is ESP.
ecology
    $H_0$: Forest fires are irrelevant to forest diversity.
    $H_a$: Forest fires enhance forest diversity.

By tradition $H_0$ is the hypothesis that says nothing interesting is going on or the current theory is correct, while $H_a$ says that something unexpected is happening or our current theories need updating. Often the investigator is hoping to disprove the null hypothesis and to suggest the alternative hypothesis in its place.

It is worth noting that while the two hypotheses are logically exclusive, they are not logically exhaustive. For instance, it's logically possible that forest fires decrease diversity even though that possibility is not included in either hypothesis. So one could write $H_a$: Forest fires decrease forest diversity, or even $H_a$: Forest fires change forest diversity. Which alternative hypothesis is chosen makes little difference for the theory of hypothesis testing, though it might make a large difference to ecologists.

Statisticians have developed several methods called hypothesis tests. We focus on just one for the moment, useful when $H_0$ is specific. The fundamental idea is to see whether the data are "compatible" with the specific $H_0$. If so, then there is no reason to doubt $H_0$; if not, then there is reason to doubt $H_0$ and possibly to consider $H_a$ in its stead. The meaning of "compatible" can change from problem to problem but typically there is a four step process.

1. Formulate a scientific null hypothesis and translate it into statistical terms.
2. Choose a low dimensional statistic, say $w = w(y_1, \ldots, y_n)$, such that the distribution of $w$ is specified under $H_0$ and likely to be different under $H_a$.
3. Calculate, or at least approximate, the distribution of $w$ under $H_0$.
4. Check whether the observed value of $w$, calculated from $y_1, \ldots, y_n$, is compatible with its distribution under $H_0$.

How would this work in the examples listed at the beginning of the section? What follows is a very brief description of how hypothesis tests might be carried out in some of those examples. To focus on the key elements of hypothesis testing, the descriptions have been kept overly simplistic. In practice, we would have to worry about confounding factors, the difficulties of random sampling, and many other issues.

public health
    Sample a large number of people with high exposure to power lines. For each person, record $X_i$, a Bernoulli random variable indicating whether that person has cancer. Model $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{Bern}(\theta_1)$. Repeat for a sample of people with low exposure, getting $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{Bern}(\theta_2)$. Estimate $\theta_1$ and $\theta_2$. Let $w = \hat\theta_1 - \hat\theta_2$. $H_0$ says $E[w] = 0$. Either the Binomial distribution or the Central Limit Theorem tells us the SD's of $\hat\theta_1$ and $\hat\theta_2$, and hence the SD of $w$. Ask: How many SD's is $w$ away from its expected value of 0? If it's off by many SD's, more than about 2 or 3, that's evidence against $H_0$.
public policy
    Test a sample of children who have been through Head Start. Model their test scores as $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{N}(\mu_1, \sigma_1)$. Do the same for children who have not been through Head Start, getting $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{N}(\mu_2, \sigma_2)$. $H_0$ says $\mu_1 = \mu_2$. Let $w = \hat\mu_1 - \hat\mu_2$. The parameters $\mu_1, \mu_2, \sigma_1, \sigma_2$ can all be estimated from the data; therefore $w$ can be calculated and its SD estimated.


    Ask: How many SD's is $w$ away from its expected value of 0? If it's off by many SD's, more than about 2 or 3, that's evidence against $H_0$.
ecology
    We could either do an observational study, beginning with one sample of plots that had had frequent forest fires in the past and another sample that had had few fires. Or we could do an experimental study, beginning with a large collection of plots and subjecting half to a regime of regular burning and the other half to a regime of no burning. In either case we would measure and compare species diversity in both sets of plots. If diversity is similar in both groups, there is no reason to doubt $H_0$. But if diversity is sufficiently different (sufficient means large compared to what is expected by chance under $H_0$), that would be evidence against $H_0$.

To illustrate in more detail, let's consider testing a new blood pressure medication. The scientific null hypothesis is that the new medication is not any more effective than the old. We'll consider two ways a study might be conducted and see how to test the hypothesis both ways.

METHOD 1. A large number of patients are enrolled in a study and their blood pressures are measured. Half are randomly chosen to receive the new medication (treatment); half receive the old (control). After a prespecified amount of time, their blood pressure is remeasured. Let $Y_{C,i}$ be the change in blood pressure from the beginning to the end of the experiment for the $i$'th control patient and $Y_{T,i}$ be the change in blood pressure from the beginning to the end of the experiment for the $i$'th treatment patient. The model is

$$
\begin{aligned}
Y_{C,1}, \ldots, Y_{C,n} &\sim \text{i.i.d. } f_C; \quad E[Y_{C,i}] = \mu_C; \quad \mathrm{Var}(Y_{C,i}) = \sigma_C^2 \\
Y_{T,1}, \ldots, Y_{T,n} &\sim \text{i.i.d. } f_T; \quad E[Y_{T,i}] = \mu_T; \quad \mathrm{Var}(Y_{T,i}) = \sigma_T^2
\end{aligned}
$$

for some unknown means $\mu_C$ and $\mu_T$ and variances $\sigma_C^2$ and $\sigma_T^2$. The translation of the hypotheses into statistical terms is

$$
H_0: \mu_T = \mu_C \qquad H_a: \mu_T \ne \mu_C
$$

Because we're testing a difference in means, let $w = \bar Y_T - \bar Y_C$. If the sample size $n$ is reasonably large, then the Central Limit Theorem says approximately $w \sim \mathrm{N}(0, \sigma_w^2)$ under $H_0$ with $\sigma_w^2 = (\sigma_T^2 + \sigma_C^2)/n$. The mean of 0 comes from $H_0$. The variance $\sigma_w^2$ comes from adding variances of independent random variables. $\sigma_T^2$ and $\sigma_C^2$, and therefore $\sigma_w^2$, can be estimated from the data. So we can calculate $w$ from the data and see whether it is within about 2 or 3 SD's of where $H_0$ says it should be. If it isn't, that's evidence against $H_0$.

METHOD 2. A large number of patients are enrolled in a study and their blood pressure is measured. They are matched together in pairs according to relevant medical characteristics. The two patients in a pair are chosen to be as similar to each other as possible. In each pair, one patient is randomly chosen to receive the new medication (treatment); the other receives the old (control). After a prespecified amount of time their blood pressures are measured again. Let $Y_{T,i}$ and $Y_{C,i}$ be the change in blood pressure for the $i$'th treatment and $i$'th control patients. The researcher records

$$
X_i = \begin{cases} 1 & \text{if } Y_{T,i} > Y_{C,i} \\ 0 & \text{otherwise} \end{cases}
$$

The model is $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{Bern}(p)$ for some unknown probability $p$. The translation of the hypotheses into statistical terms is

$$
H_0: p = .5 \qquad H_a: p \ne .5
$$

Let $w = \sum X_i$. Under $H_0$, $w \sim \mathrm{Bin}(n, .5)$. To test $H_0$ we plot the $\mathrm{Bin}(n, .5)$ distribution and see where $w$ falls on the plot. Figure 2.35 shows the plot for $n = 100$. If $w$ turned out to be between about 40 and 60, then there would be little reason to doubt $H_0$. But on the other hand, if $w$ turned out to be less than 40 or greater than 60, then we would begin to doubt. The larger $|w - 50|$, the greater the cause for doubt.

This blood pressure example exhibits a feature common to many hypothesis tests. First, we're testing a difference in means. I.e., $H_0$ and $H_a$ disagree about a mean, in this case the mean change in blood pressure from the beginning to the end of the experiment. So we take $w$ to be the difference in sample means. Second, since the experiment is run on a large number of people, the Central Limit Theorem says that $w$ will be approximately Normally distributed. Third, we can calculate or estimate the mean $\mu_0$ and SD $\sigma_0$ under $H_0$. So fourth, we can compare the value of $w$ from the data to what $H_0$ says its distribution should be.

In Method 1 above, that's just what we did. In Method 2 above, we didn't use the Normal approximation; we used the Binomial distribution. But we could have used the approximation. From facts about the Binomial distribution we know


$\mu_0 = n/2$ and $\sigma_0 = \sqrt{n}/2$ under $H_0$. For $n = 100$, Figure 2.36 compares the exact Binomial distribution to the Normal approximation.

Figure 2.35: pdf of the Bin(100, .5) distribution

In general, when the Normal approximation is valid, we compare $w$ to the $\mathrm{N}(\mu_0, \sigma_0)$ density, where $\mu_0$ is calculated according to $H_0$ and $\sigma_0$ is either calculated according to $H_0$ or estimated from the data. If $t \equiv |w - \mu_0| / \sigma_0$ is bigger than about 2 or 3, that's evidence against $H_0$.

The following example shows hypothesis testing at work.

Example 2.18 (Tooth Growth, continued)
This continues Example 2.1 (pg. 100). Let's concentrate on a particular dosage, say dose = 0.5, and test the null hypothesis that, on average, the delivery method (supp) makes no difference to tooth growth, as opposed to the alternative that it does make a difference. Those are the scientific hypotheses. The data for testing the hypothesis are $x_1, \ldots, x_{10}$, the 10 recordings of growth when supp = VC, and $y_1, \ldots, y_{10}$, the 10 recordings of growth when supp = OJ. The $x_i$'s are 10 independent draws from one distribution; the $y_i$'s are 10 independent draws from another:

$$
x_1, \ldots, x_{10} \sim \text{i.i.d. } f_{VC} \qquad
y_1, \ldots, y_{10} \sim \text{i.i.d. } f_{OJ}
$$
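As a companion to the Binomial-versus-Normal comparison in Figures 2.35 and 2.36, the calculations for Method 2 with $n = 100$ pairs can be sketched in a few lines of R. The observed count `w.obs` below is a hypothetical value chosen for illustration, not a result from any study, and the variable names are likewise made up.

```r
# Method 2 with n = 100 pairs: under H0, w ~ Bin(100, 0.5).
n <- 100
mu0 <- n / 2            # mean of w under H0
sigma0 <- sqrt(n) / 2   # SD of w under H0

# Exact Binomial probabilities next to the Normal density at the same points
w.vals <- 30:70
exact <- dbinom(w.vals, n, 0.5)
normal.approx <- dnorm(w.vals, mu0, sigma0)
max.gap <- max(abs(exact - normal.approx))  # largest pointwise discrepancy
print(max.gap)

# Probability under H0 that w falls in the range 40..60
p.inside <- pbinom(60, n, 0.5) - pbinom(39, n, 0.5)
print(p.inside)

# HYPOTHETICAL observed count, for illustration only
w.obs <- 63
t.stat <- abs(w.obs - mu0) / sigma0  # number of SD's between w.obs and mu0
print(t.stat)
```

With these numbers `t.stat` is 13/5 = 2.6, which falls in the "about 2 or 3" range that the text treats as evidence against $H_0$.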


Figure 2.36: pdfs of the Bin(100, .5) (dots) and N(50, 5) (line) distributions

Define the two means to be $\mu_{VC} \equiv E[x_i]$ and $\mu_{OJ} \equiv E[y_i]$. The scientific hypothesis and its alternative, translated into statistical terms, become

$$
H_0: \mu_{VC} = \mu_{OJ} \qquad H_a: \mu_{VC} \ne \mu_{OJ}
$$

Those are the hypotheses in statistical terms.

Because we're testing a difference in means, we choose our one dimensional summary statistic to be $w = |\bar x - \bar y|$. Small values of $w$ support $H_0$; large values support $H_a$. But how small is small; how large is large? The Central Limit Theorem says

$$
\bar x \sim \mathrm{N}\!\left(\mu_{VC}, \frac{\sigma_{VC}}{\sqrt{n}}\right) \qquad
\bar y \sim \mathrm{N}\!\left(\mu_{OJ}, \frac{\sigma_{OJ}}{\sqrt{n}}\right)
$$


approximately, so that under $H_0$,

$$
w \sim \mathrm{N}\!\left(0, \sqrt{\frac{\sigma_{VC}^2 + \sigma_{OJ}^2}{n}}\right),
$$

approximately. The statistic $w$ (called $t$ in the figure and code below) can be calculated, its SD estimated, and its approximate density plotted as in Figure 2.37. We can see from the Figure, or from the fact that $t / \sigma_t \approx 3.2$, that the observed value of $t$ is moderately far from its expected value under $H_0$. The data provide moderately strong evidence against $H_0$.

Figure 2.37: Approximate density of summary statistic $t$. The black dot is the value of $t$ observed in the data.

Figure 2.37 was produced with the following R code.

    x <- ToothGrowth$len[ToothGrowth$supp == "VC" & ToothGrowth$dose == 0.5]
    y <- ToothGrowth$len[ToothGrowth$supp == "OJ" & ToothGrowth$dose == 0.5]
    t <- abs(mean(x) - mean(y))                  # observed statistic
    sd <- sqrt((var(x) + var(y)) / length(x))    # estimated SD under H0
    tvals <- seq(-4 * sd, 4 * sd, len = 80)
    plot(tvals, dnorm(tvals, 0, sd), type = "l",
         xlab = "", ylab = "", main = "")
    points(t, 0, pch = 16, cex = 1.5)

The points(...) adds the observed value of $t$ to the plot.

In the next example it is difficult to estimate the distribution of $w$ under $H_0$; so we use simulation to work it out.

Example 2.19 (Baboons)
Because baboons are promiscuous, when a baby is born it is not obvious, at least to humans, who the father is. But do the baboons themselves know who the father is? Buchan et al. [2003] report a study of baboon behavior that attempts to answer that question. For more information see http://www.princeton.edu/~baboon. Baboons live in social groups comprised of several adult males, several adult females, and juveniles. Researchers followed several groups of baboons periodically over a period of several years to learn about baboon behavior. The particular aspect of behavior that concerns us here is that adult males sometimes come to the aid of juveniles. If adult males know which juveniles are their own children, then it's at least possible that they tend to aid their own children more than other juveniles. The data set baboons (available on the web site)¹ contains data on all the recorded instances of adult males helping juveniles. The first four lines of the file look like this.

    Recip  Father  Maleally  Dadpresent  Group
    ABB    EDW     EDW       Y           OMO
    ABB    EDW     EDW       Y           OMO
    ABB    EDW     EDW       Y           OMO
    ABB    EDW     POW       Y           OMO

1. Recip identifies the juvenile who received help. In the four lines shown here, it is always ABB.
2. Father identifies the father of the juvenile. Researchers know the father through DNA testing of fecal samples. In the four lines shown here, it is always EDW.

¹ We have slightly modified the data to avoid some irrelevant complications.


3. Maleally identifies the adult male who helped the juvenile. In the fourth line we see that POW aided ABB, who is not his own child.
4. Dadpresent tells whether the father was present in the group when the juvenile was aided. In this data set it is always Y.
5. Group identifies the social group in which the incident occurred. In the four lines shown here, it is always OMO.

Let $w$ be the number of cases in which a father helps his own child. The snippet

    dim(baboons)
    sum(baboons$Father == baboons$Maleally)

reveals that there are $n = 147$ cases in the data set, and that $w = 87$ are cases in which a father helps his own child. The next step is to work out the distribution of $w$ under $H_0$: adult male baboons do not know which juveniles are their children.

Let's examine one group more closely, say the OMO group. Typing

    baboons[baboons$Group == "OMO", ]

displays the relevant records. There are 13 of them. EDW was the father in 9, POW was the father in 4. EDW provided the help in 9, POW in 4. The father was the ally in 9 cases; in 4 he was not. $H_0$ implies that EDW and POW would distribute their help randomly among the 13 cases. If $H_0$ is true, i.e., if EDW distributes his 9 helps and POW distributes his 4 helps randomly among the 13 cases, what would be the distribution of $W$, the number of times a father helps his own child? We can answer that question by a simulation in R. (We could also answer it by doing some math or by knowing the hypergeometric distribution, but that's not covered in this text.)

    dads <- baboons$Father[baboons$Group == "OMO"]
    ally <- baboons$Maleally[baboons$Group == "OMO"]
    N.sim <- 1000
    w <- rep(NA, N.sim)
    for (i in 1:N.sim) {
      perm <- sample(dads)         # shuffle the fathers among the 13 cases
      w[i] <- sum(perm == ally)    # father-helps-own-child count under H0
    }
    hist(w)
    table(w)
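For readers who are curious about the hypergeometric route mentioned in passing, here is a sketch that compares it with the permutation simulation. It does not use the real baboons file: the vectors `dads` and `ally` are stand-ins built only from the counts quoted above (9 and 4 for both fathers and allies), and the reduction to a hypergeometric count is this sketch's own reasoning rather than something the text derives. If $a$ of EDW's 9 cases as father line up with his 9 cases as ally, then POW matches in $a - 5$ cases, so $W = 2a - 5$.

```r
# Stand-in for the OMO records, built ONLY from the counts in the text
# (the real data set is not used; the order of cases is arbitrary).
dads <- rep(c("EDW", "POW"), times = c(9, 4))
ally <- rep(c("EDW", "POW"), times = c(9, 4))

# Exact distribution under H0: a ~ hypergeometric, W = 2a - 5.
a <- 5:9
w.vals <- 2 * a - 5
w.prob <- dhyper(a, m = 9, n = 4, k = 9)

# Permutation simulation in the style of the text, for comparison
set.seed(1)
N.sim <- 10000
w <- replicate(N.sim, sum(sample(dads) == ally))
sim <- as.numeric(table(factor(w, levels = w.vals))) / N.sim

print(round(rbind(w = w.vals, exact = w.prob, simulated = sim), 3))

p.ge.9 <- sum(w.prob[w.vals >= 9])  # exact P[W >= 9] under H0
print(p.ge.9)
```

Under these assumptions the exact probability of 9 or more matches is 253/715, a little over one third, which agrees with the text's remark that $w = 9$ is not so unusual under $H_0$.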


Try out the simulation for yourself. It shows that the observed number in the data, $w = 9$, is not so unusual under $H_0$.

What about the other social groups? If we find out how many there are, we can do a similar simulation for each. Let's write an R function to help.

    g.sim <- function(group, N.sim) {
      dads <- baboons$Father[baboons$Group == group]
      ally <- baboons$Maleally[baboons$Group == group]
      w <- rep(NA, N.sim)
      for (i in 1:N.sim) {
        perm <- sample(dads)
        w[i] <- sum(perm == ally)
      }
      return(w)
    }

Figure 2.38 shows histograms of g.sim for each group, along with a dot showing the observed value of $w$ in the data set. For some of the groups the observed value of $w$, though a bit on the high side, might be considered consistent with $H_0$. For others, the observed value of $w$ falls outside the range of what might be reasonably expected by chance. In a case like this, where some of the evidence is strongly against $H_0$ and some is only weakly against $H_0$, an inexperienced statistician might believe the overall case against $H_0$ is not very strong. But that's not true. In fact, every one of the groups contributes a little evidence against $H_0$, and the total evidence against $H_0$ is very strong. To see this, we can combine the separate simulations into one. The snippet of code given below for Figure 2.39 does this. Each male's help is randomly reassigned to a juvenile within his group. The number of times when a father helps his own child is summed over the different groups. Simulated numbers are shown in the histogram in Figure 2.39. The dot in the figure is at 84, the actual number of instances in the full data set. Figure 2.39 suggests that it is almost impossible that the 84 instances arose by chance, as $H_0$ would suggest. We should reject $H_0$ and reach the conclusion that (a) adult male baboons do know who their own children are, and (b) they give help preferentially to their own children.

Figure 2.38: Number of times baboon father helps own child in Example 2.19. Histograms are simulated according to $H_0$. Dots are observed data.

Figure 2.38 was produced with the following snippet.

    groups <- unique(baboons$Group)
    n.groups <- length(groups)
    par(mfrow = c(ceiling(n.groups / 2), 2))   # two columns, enough rows for all groups
    for (i in 1:n.groups) {
      good <- baboons$Group == groups[i]
      w.obs <- sum(baboons$Father[good] == baboons$Maleally[good])
      w.sim <- g.sim(groups[i], N.sim)
      hist(w.sim, xlab = "w", ylab = "", main = groups[i],
           xlim = range(c(w.obs, w.sim)))
      points(w.obs, 0, pch = 16, cex = 1.5)
      print(w.obs)
    }

Figure 2.39: Histogram of simulated values of w.tot. The dot is the value observed in the baboon data set.

Figure 2.39 was produced with the following snippet.

    w.obs <- rep(NA, n.groups)
    w.sim <- matrix(NA, n.groups, N.sim)


    for (i in 1:n.groups) {
      good <- baboons$Group == groups[i]
      w.obs[i] <- sum(baboons$Father[good] == baboons$Maleally[good])
      w.sim[i, ] <- g.sim(groups[i], N.sim)
    }
    w.obs.tot <- sum(w.obs)
    w.sim.tot <- apply(w.sim, 2, sum)
    hist(w.sim.tot, xlab = "w.tot", ylab = "",
         xlim = range(c(w.obs.tot, w.sim.tot)))
    points(w.obs.tot, 0, pch = 16, cex = 1.5)
    print(w.obs.tot)

2.8 Exercises

1. (a) Justify Equation 2.1 on page 106.
   (b) Show that the function $g(x)$ defined just after Equation 2.1 is a probability density. I.e., show that it integrates to 1.

2. This exercise uses the ToothGrowth data from Examples 2.1 and 2.18.
   (a) Estimate the effect of delivery mode for doses 1.0 and 2.0. Does it seem that delivery mode has a different effect at different doses?
   (b) Does it seem as though delivery mode changes the effect of dose?
   (c) For each delivery mode, make a set of three boxplots to compare the three doses.

3. This exercise uses data from 272 eruptions of the Old Faithful geyser in Yellowstone National Park. The data are in the R data set faithful. One column contains the duration of each eruption; the other contains the waiting time to the next eruption.
   (a) Plot eruption versus waiting. Is there a pattern? What is going on?
   (b) Try ts.plot(faithful$eruptions[1:50]). Try other sets of eruptions, say ts.plot(faithful$eruptions[51:100]). (There is nothing magic about 50, but if you plot all 272 eruptions then the pattern might be harder to see. Choose any convenient number that lets you see what's going on.) What is going on?

4. This exercise relies on data from the neurobiology experiment described in Example 2.6.
   (a) Download the data from the book's website.
   (b) Reproduce Figure 2.17.
   (c) Make a plot similar to Figure 2.17 but for a different neuron and different tastant.
   (d) Write an R function that accepts a neuron and tastant as input and produces a plot like Figure 2.17.
   (e) Use the function from the previous part to look for neurons that respond to particular tastants. Describe your results.

5. This exercise relies on Example 2.8 about the Slater school. There were 8 cancers among 145 teachers. Figure 2.20 shows the likelihood function. Suppose the same incidence rate had been found among more teachers. How would that affect $\ell(\theta)$? Make a plot similar to Figure 2.20, but pretending that there had been 80 cancers among 1450 teachers. Compare to Figure 2.20. What is the result? Does it make sense? Try other numbers if it helps you see what is going on.

6. This exercise continues Exercise 35 in Chapter 1. Let $p$ be the fraction of the population that uses illegal drugs.
   (a) Suppose researchers know that $p \approx .1$. Jane and John are given the randomized response question. Jane answers "yes"; John answers "no". Find the posterior probability that Jane uses cocaine; find the posterior probability that John uses cocaine.
   (b) Now suppose that $p$ is not known and the researchers give the randomized response question to 100 people. Let $X$ be the number who answer "yes". What is the likelihood function?
   (c) What is the mle of $p$ if $X = 50$, if $X = 60$, if $X = 70$, if $X = 80$, if $X = 90$?

7. This exercise deals with the likelihood function for Poisson distributions.
   (a) Let $x_1, \ldots, x_n \sim$ i.i.d. $\mathrm{Poi}(\lambda)$. Find $\ell(\lambda)$ in terms of $x_1, \ldots, x_n$.


   (b) Show that $\ell(\lambda)$ depends only on $\sum x_i$ and not on the specific values of the individual $x_i$'s.
   (c) Let $y_1, \ldots, y_n$ be a sample from $\mathrm{Poi}(\lambda)$. Show that $\hat\lambda = \bar y$ is the m.l.e.
   (d) Find the m.l.e. in Example 1.4.

8. The book Data [Andrews and Herzberg, 1985] contains lots of data sets that have been used for various purposes in statistics. One famous data set records the annual number of deaths by horse kicks in the Prussian Army from 1875-1894 for each of 14 corps. Download the data from statlib at http://lib.stat.cmu.edu/datasets/Andrews/T04.1. (It is Table 4.1 in the book.) Let $Y_{ij}$ be the number of deaths in year $i$, corps $j$, for $i = 1875, \ldots, 1894$ and $j = 1, \ldots, 14$. The $Y_{ij}$'s are in columns 5-18 of the table.
   (a) What are the first four columns of the table?
   (b) What is the last column of the table?
   (c) What is a good model for the data?
   (d) Suppose you model the data as i.i.d. $\mathrm{Poi}(\lambda)$. (Yes, that's a good answer to the previous question.)
       i. Plot the likelihood function for $\lambda$.
       ii. Find $\hat\lambda$.
       iii. What can you say about the rate of death by horse kick in the Prussian cavalry at the end of the 19th century?
   (e) Is there any evidence that different corps had different death rates? How would you investigate that possibility?

9. Use the data from Example 2.8. Find the m.l.e. for $\theta$.

10. $X_1, \ldots, X_n \sim \mathrm{Normal}(\mu, 1)$. Multiple choice: The m.l.e. $\hat\mu$ is found from the equation
    (a) $\frac{d}{d\mu} \frac{d}{dx} f(x_1, \ldots, x_n \mid \mu) = 0$
    (b) $\frac{d}{d\mu} f(x_1, \ldots, x_n \mid \mu) = 0$
    (c) $\frac{d}{dx} f(x_1, \ldots, x_n \mid \mu) = 0$

11. This exercise deals with the likelihood function for Normal distributions.
    (a) Let $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{N}(\mu, 1)$. Find $\ell(\mu)$ in terms of $y_1, \ldots, y_n$.


    (b) Show that $\ell(\mu)$ depends only on $\sum y_i$ and not on the specific values of the individual $y_i$'s.
    (c) Let $n = 10$ and choose a value for $\mu$. Use R to generate a sample of size 10 from $\mathrm{N}(\mu, 1)$. Plot the likelihood function. How accurately can you estimate $\mu$ from a sample of size 10?
    (d) Let $y_1, \ldots, y_{10} \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$ where $\sigma$ is known but not necessarily equal to 1. Find $\ell(\mu)$ in terms of $y_1, \ldots, y_{10}$ and $\sigma$.
    (e) Let $y_1, \ldots, y_{10} \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$ where $\mu$ is known but $\sigma$ is unknown. Find $\ell(\sigma)$ in terms of $y_1, \ldots, y_{10}$ and $\mu$.

12. Let $y_1, \ldots, y_n$ be a sample from $\mathrm{N}(\mu, 1)$. Show that $\hat\mu = \bar y$ is the m.l.e.

13. Let $y_1, \ldots, y_n$ be a sample from $\mathrm{N}(\mu, \sigma)$ where $\mu$ is known. Show that $\hat\sigma^2 = n^{-1} \sum (y_i - \mu)^2$ is the m.l.e.

14. Recall the discoveries data from page 10 on the number of great discoveries each year. Let $Y_i$ be the number of great discoveries in year $i$ and suppose $Y_i \sim \mathrm{Poi}(\lambda)$. Plot the likelihood function $\ell(\lambda)$. Figure 1.3 suggested that $\lambda \approx 3.1$ explained the data reasonably well. How sure can we be about the 3.1?

15. Justify each step of Equation 2.6.

16. Page 159 discusses a simulation experiment comparing the sample mean and sample median as estimators of a population mean. Figure 2.27 shows the results of the simulation experiment. Notice that the vertical scale decreases from panel (a) to (b), to (c), to (d). Why? Give a precise mathematical formula for the amount by which the vertical scale should decrease. Does the actual decrease agree with your formula?

17. In the medical screening example on page 165, find the probability that the patient has the disease given that the test is negative.

18. A drug testing example

19. Country A suspects country B of having hidden chemical weapons. Based on secret information from their intelligence agency they calculate P[B has weapons] = .8. But then country B agrees to inspections, so A sends inspectors. If there are no weapons then of course the inspectors won't find any. But if there are weapons then they will be well hidden, with only a 20% chance of being found. I.e.,

        P[finding weapons | weapons exist] = .2.     (2.19)

    No weapons are found. Find the probability that B has weapons. I.e., find

        Pr[B has weapons | no weapons are found].

20. Let $T$ be the amount of time a customer spends on Hold when calling the computer help line. Assume that $T \sim \mathrm{Exp}(\lambda)$ where $\lambda$ is unknown. A sample of $n$ calls is randomly selected. Let $t_1, \ldots, t_n$ be the times spent on Hold.
    (a) Choose a value of $\lambda$ for doing simulations.
    (b) Use R to simulate a sample of size $n = 10$.
    (c) Plot $\ell(\lambda)$ and find $\hat\lambda$.
    (d) About how accurately can you determine $\lambda$?
    (e) Show that $\ell(\lambda)$ depends only on $\sum t_i$ and not on the values of the individual $t_i$'s.

21. There are two coins. One is fair; the other is two-headed. You randomly choose a coin and toss it.
    (a) What is the probability the coin lands Heads?
    (b) What is the probability the coin is two-headed given that it landed Heads?
    (c) What is the probability the coin is two-headed given that it landed Tails? Give a formal proof, not intuition.
    (d) You are about to toss the coin a second time. What is the probability that the second toss lands Heads given that the first toss landed Heads?

22. There are two coins. For coin A, P[H] = 1/4; for coin B, P[H] = 2/3. You randomly choose a coin and toss it.
    (a) What is the probability the coin lands Heads?
    (b) What is the probability the coin is A given that it landed Heads? What is the probability the coin is A given that it landed Tails?


    (c) You are about to toss the coin a second time. What is the probability the second toss lands Heads given that the first toss landed Heads?

23. At Dupont College (apologies to Tom Wolfe) Math SAT scores among math majors are distributed N( , 50) while Math SAT scores among non-math majors are distributed N( , 50). 5% of the students are math majors. A randomly chosen student has a math SAT score of 720. Find the probability that the student is a math major.

24. The Great Randi is a professed psychic and claims to know the outcome of coin flips. This problem concerns a sequence of 20 coin flips that Randi will try to guess (or not guess, if her claim is correct).
    (a) Take the prior P[Randi is psychic] = .01.
        i. Before any guesses have been observed, find P[first guess is correct] and P[first guess is incorrect].
        ii. After observing 10 consecutive correct guesses, find the updated P[Randi is psychic].
        iii. After observing 10 consecutive correct guesses, find P[next guess is correct] and P[next guess is incorrect].
        iv. After observing 20 consecutive correct guesses, find P[next guess is correct] and P[next guess is incorrect].
    (b) Two statistics students, a skeptic and a believer, discuss Randi after class.
        Believer: I believe her, I think she's psychic.
        Skeptic: I doubt it. I think she's a hoax.
        Believer: How could you be convinced? What if Randi guessed 10 in a row? What would you say then?
        Skeptic: I would put that down to luck. But if she guessed 20 in a row then I would say P[Randi can guess coin flips] $\approx$ .5.
        Find the skeptic's prior probability that Randi can guess coin flips.
    (c) Suppose that Randi doesn't claim to guess coin tosses perfectly, only that she can guess them at better than 50%. 100 trials are conducted. Randi gets 60 correct. Write down $H_0$ and $H_a$ appropriate for testing Randi's claim. Do the data support the claim? What if 70 were correct? Would that support the claim?
    (d) The Great Sandi, a statistician, writes the following R code to calculate a probability for Randi.


        y <- rbinom(500, 100, .5)
        sum(y == 60) / 500

    What is Sandi trying to calculate? Write a formula (don't evaluate it) for the quantity Sandi is trying to calculate.

25. Let $w$ be the fraction of free throws that Shaquille O'Neal (or any other player of your choosing) makes during the next NBA season. Find a density that approximately represents your prior opinion for $w$.

26. Let $t$ be the amount of time between the moment when the sun first touches the horizon in the afternoon and the moment when it sinks completely below the horizon. Without making any observations, assess your distribution for $t$.

27. Assess your prior distribution for $b$, the proportion of M&M's that are brown. Buy as many M&M's as you like and count the number of browns. Calculate your posterior distribution.

28. (a) Let $y \sim \mathrm{N}(\theta, 1)$ and let the prior distribution for $\theta$ be $\theta \sim$ N( , 1).
        i. When $y$ has been observed, what is the posterior density of $\theta$?
        ii. Show that the density in part i. is a Normal density.
        iii. Find its mean and SD.
    (b) Let $y \sim \mathrm{N}(\theta, \sigma_y)$ and let the prior distribution for $\theta$ be $\theta \sim \mathrm{N}(m, \sigma)$. Suppose that $\sigma_y$, $m$, and $\sigma$ are known constants.
        i. When $y$ has been observed, what is the posterior density of $\theta$?
        ii. Show that the density in part i. is a Normal density.
        iii. Find its mean and SD.
    (c) Let $y_1, \ldots, y_n$ be a sample of size $n$ from $\mathrm{N}(\theta, \sigma_y)$ and let the prior distribution for $\theta$ be $\theta \sim \mathrm{N}(m, \sigma)$. Suppose that $\sigma_y$, $m$, and $\sigma$ are known constants.
        i. When $y_1, \ldots, y_n$ have been observed, what is the posterior density of $\theta$?
        ii. Show that the density in part i. is a Normal density.
        iii. Find its mean and SD.
    (d) An example with data.

29. Verify Equations 2.16, 2.17, and 2.18.


30. Refer to the discussion of predictive intervals on page 175. Justify the claim that $(-\infty, -.72)$, $(-3.65, -0.36)$, and $(-3.28, \infty)$ are 90% prediction intervals. Find the corresponding 80% prediction intervals.

31. (a) Following Example 2.17 (pg. 177), find $\Pr[y_f = k \mid y_1, \ldots, y_n]$ for $k = 1, 2, 3, 4$.
    (b) Using the results from part (a), make a plot analogous to Figure 2.33 (pg. 176).

32. Suppose you want to test whether the random number generator in R generates each of the digits $0, 1, \ldots, 9$ with probability 0.1. How could you do it? You may consider first testing whether R generates 0 with the right frequency, then repeating the analysis for each digit.

33. (a) Repeat the analysis of Example 2.18 (pg. 184), but for dose = 1 and dose = 2.
    (b) Test the hypothesis that increasing the dose from 1 to 2 makes no difference in tooth growth.
    (c) Test the hypothesis that the effect of increasing the dose from 1 to 2 is the same for supp = VC as it is for supp = OJ.
    (d) Do the answers to parts (a), (b) and (c) agree with your subjective assessment of Figures 2.2, 2.3, and 2.6?

34. Continue Exercise 36 from Chapter 1. The autoganzfeld trials resulted in $X = 122$.
    (a) What is the parameter in this problem?
    (b) Plot the likelihood function.
    (c) Test the "no ESP, no cheating" hypothesis.
    (d) Adopt and plot a reasonable and mathematically tractable prior distribution for the parameter. Compute and plot the posterior distribution.
    (e) Find the probability of a match on the next trial given $X = 122$.
    (f) What do you conclude?

35. Three biologists named Asiago, Brie, and Cheshire are studying a mutation in morning glories, a species of flowering plant. The mutation causes the flowers to be white rather than colored. But it is not known whether the mutation has any effect on the plants' fitness. To study the question, each biologist takes a random sample of morning glories having the mutation, counts the seeds that each plant produces, and calculates a likelihood set for the average number of seeds produced by mutated morning glories.
    Asiago takes a sample of size $n_A = 100$ and calculates a $\mathrm{LS}_{.1}$ set. Brie takes a sample of size $n_B = 400$ and calculates a $\mathrm{LS}_{.1}$ set. Cheshire takes a sample of size $n_C = 100$ and calculates a $\mathrm{LS}_{.2}$ set.
    (a) Who will get the longer interval, Asiago or Brie? About how much longer will it be? Explain.
    (b) Who will get the longer interval, Asiago or Cheshire? About how much longer will it be? Explain.

36. In the 1990's, a committee at MIT wrote A Study on the Status of Women Faculty in Science at MIT. In 1994 there were 15 women among the 209 tenured faculty in the six departments of the School of Science. They found, among other things, that the amount of resources (money, lab space, etc.) given to women was, on average, less than the amount given to men. The report goes on to pose the question: "Given the tiny number of women faculty in any department one might ask if it is possible to obtain significant data to support a claim of gender differences...."
    What does statistics say about it? Focus on a single resource, say laboratory space. The distribution of lab space is likely to be skewed. I.e., there will be a few people with lots more space than most others. So let's model the distribution of lab space with an Exponential distribution. Let $x_1, \ldots, x_{15}$ be the amounts of space given to tenured women, so $x_i \sim \mathrm{Exp}(\lambda_w)$ for some unknown parameter $\lambda_w$. Let $M$ be the average lab space given to tenured men. Assume that $M$ is known to be 100, from the large number of tenured men. If there is no discrimination, then $\lambda_w = 100$. ($\lambda_w$ is $E(x_i)$.)
    Chris Stats writes the following R code.

        y <- rexp(15, .01)
        m <- mean(y)
        s <- sqrt(var(y) / 15)
        lo <- m - 2 * s
        hi <- m + 2 * s


    What is y supposed to represent? What is (lo, hi) supposed to represent?
    Now Chris puts the code in a loop.

        n <- 0
        for (i in 1:1000) {
          y <- rexp(15, .01)
          m <- mean(y)
          s <- sqrt(var(y) / 15)
          lo <- m - 2 * s
          hi <- m + 2 * s
          if (lo < 100 & hi > 100) n <- n + 1
        }
        print(n / 1000)

    What is n/1000 supposed to represent? If a sample size of 15 is sufficiently large for the Central Limit Theorem to apply, then what, approximately, is the value of n/1000?

37. Refer to the R code in Example 2.1 (pg. 100). Why was it necessary to have a brace ({) after the line for (j in 1:3) but not after the line for (i in 1:2)?


CHAPTER 3
REGRESSION

3.1 Introduction

Regression is the study of how the distribution of one variable, $Y$, changes according to the value of another variable, $X$. R comes with many data sets that offer regression examples. Four are shown in Figure 3.1.

1. The data set attenu contains 182 observations, from 23 California earthquakes, on several variables including hypocenter-to-station distance and peak acceleration. Figure 3.1 (a) shows acceleration plotted against distance. There is a clear relationship between $X$ = distance and the distribution of $Y$ = acceleration. When $X$ is small, the distribution of $Y$ has a long right-hand tail. But when $X$ is large, $Y$ is always small.
2. The data set airquality contains data about air quality in New York City. Ozone levels $Y$ are plotted against temperature $X$ in Figure 3.1 (b). When $X$ is small then the distribution of $Y$ is concentrated on values below about 50 or so. But when $X$ is large, $Y$ can range up to about 150 or so.
3. Figure 3.1 (c) shows data from mtcars. Weight is on the abscissa and the type of transmission (manual = 1, automatic = 0) is on the ordinate. The distribution of weight is clearly different for cars with automatic transmissions than for cars with manual transmissions.
4. The data set faithful contains data about eruptions of the Old Faithful geyser in Yellowstone National Park. Figure 3.1 (d) shows $Y$ = time to next eruption plotted against $X$ = duration of current eruption. Small values of $X$ tend to indicate small values of $Y$.
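The idea that the distribution of $Y$ changes with $X$ can also be seen numerically, by summarizing $Y$ separately within groups of $X$. The sketch below uses two of the built-in data sets named above. Splitting distance into three bands at its terciles is an arbitrary choice made for this illustration, and the object names are made up.

```r
# Summaries of Y within groups of X for two built-in data sets.

# mtcars: distribution of weight (1000 lb) by transmission type
# (am = 0 is automatic, am = 1 is manual)
wt.by.am <- tapply(mtcars$wt, mtcars$am, summary)
print(wt.by.am)

# attenu: quartiles of peak acceleration within three distance bands.
# ASSUMPTION: tercile cut points, chosen only for illustration.
breaks <- quantile(attenu$dist, probs = c(0, 1 / 3, 2 / 3, 1))
band <- cut(attenu$dist, breaks = breaks, include.lowest = TRUE)
accel.by.dist <- tapply(attenu$accel, band, quantile,
                        probs = c(0.25, 0.5, 0.75))
print(accel.by.dist)
```

Comparing the printed quartiles across distance bands gives a rough numerical counterpart to the visual impression from a scatterplot.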


Figure 3.1: Four regression examples


Figure 3.1 was produced by the following R snippet.

    par(mfrow = c(2, 2))
    data(attenu)
    plot(attenu$dist, attenu$accel, xlab = "Distance",
         ylab = "Acceleration", main = "(a)", pch = ".")
    data(airquality)
    plot(airquality$Temp, airquality$Ozone, xlab = "temperature",
         ylab = "ozone", main = "(b)", pch = ".")
    data(mtcars)
    stripchart(mtcars$wt ~ mtcars$am, pch = 1, xlab = "Weight",
               method = "jitter", ylab = "Manual Transmission",
               main = "(c)")
    data(faithful)
    plot(faithful, pch = ".", main = "(d)")

Both continuous and discrete variables can turn up in regression problems. In the attenu, airquality and faithful data sets, both $X$ and $Y$ are continuous. In mtcars, it seems natural to think of how the distribution of $Y$ = weight varies with $X$ = transmission, in which case $X$ is discrete and $Y$ is continuous. But we could also consider how the fraction of cars $Y$ with automatic transmissions varies as a function of $X$ = weight, in which case $Y$ is discrete and $X$ is continuous.

In many regression problems we just want to display the relationship between $X$ and $Y$. Often a scatterplot or stripchart will suffice, as in Figure 3.1. Other times, we will use a statistical model to describe the relationship. The statistical model may have unknown parameters which we may wish to estimate or otherwise make inference for. Examples of parametric models will come later. Our study of regression begins with data display.

In many instances a simple plot is enough to show the relationship between $X$ and $Y$. But sometimes the relationship is obscured by the scatter of points. Then it helps to draw a smooth curve through the data. Examples 3.1 and 3.2 illustrate.

Example 3.1 (Draft Lottery)

The result of the 1970 draft lottery is available at DASL. The website explains:

    "In 1970, Congress instituted a random selection process for the military draft. All 366 possible birth dates were placed in plastic capsules in a rotating drum and were selected one by one. The first date drawn from the drum received draft number one and eligible men born on that date were drafted first. In a truly random lottery there should be no relationship between the date and the draft number."

Figure 3.2 shows the data, with X = day of year and Y = draft number. There is no apparent relationship between X and Y.

Figure 3.2 was produced with the following snippet.

    # draft is the data frame of lottery results downloaded from DASL
    plot ( draft$Day.of.year, draft$Draft.No,
           xlab="Day of year", ylab="Draft number" )

More formally, a relationship between X and Y usually means that the expected value of Y is different for different values of X. (We don't consider changes in SD or other aspects of the distribution here.) Typically, when X is a continuous variable, changes in Y are smooth, so we would adopt the model

$$ E[Y \mid X] = g(X) \tag{3.1} $$

for some unknown smooth function g.

R has a variety of built-in functions to estimate g. These functions are called scatterplot smoothers, for obvious reasons. Figure 3.3 shows the draft lottery data with two scatterplot smoother estimates of g. Both estimates show a clear trend: birthdays later in the year were more likely to have low draft numbers. Following discovery of this trend, the procedure for drawing draft numbers was changed in subsequent years.

Figure 3.3 was produced with the following snippet.

    x <- draft$Day.of.year
    y <- draft$Draft.No
    plot ( x, y, xlab="Day of year", ylab="Draft number" )
    lines ( lowess ( x, y ) )            # solid curve
    lines ( supsmu ( x, y ), lty=2 )     # dashed curve

Figure 3.2: 1970 draft lottery. Draft number vs. day of year

Figure 3.3: 1970 draft lottery. Draft number vs. day of year. Solid curve fit by lowess; dashed curve fit by supsmu.

• lowess (locally weighted scatterplot smoother) and supsmu (super smoother) are two of R's scatterplot smoothers. In the figure, the lowess curve is less wiggly than the supsmu curve. Each smoother has a tuning parameter that can make the curve more or less wiggly. Figure 3.3 was made with the default values for both smoothers.

Example 3.2 (Seedlings, continued)
As mentioned in Example 1.6, the seedlings study was carried out at the Coweeta Long Term Ecological Research station in western North Carolina. There were five plots at different elevations on a hillside. Within each plot there was a 60m × 1m strip running along the hillside, divided into 60 1m × 1m quadrats. It is possible that the arrival rate of New seedlings and the survival rates of both Old and New seedlings are different in different plots and different quadrats. Figure 3.4 shows the total number of New seedlings in each of the quadrats in one of the five plots. The lowess curve brings out the spatial trend: low numbers to the left, a peak around quadrat 40, and a slight falling off by quadrat 60.

Figure 3.4 was produced by

    plot ( total.new, xlab="quadrat index",
           ylab="total new seedlings" )
    lines ( lowess ( total.new ) )

In a regression problem the data are pairs $(x_i, y_i)$ for $i = 1, \dots, n$. For each $i$, $y_i$ is a random variable whose distribution depends on $x_i$. We write

$$ y_i = g(x_i) + \epsilon_i. \tag{3.2} $$

Equation 3.2 expresses $y_i$ as a systematic or explainable part $g(x_i)$ and an unexplained part $\epsilon_i$. g is called the regression function. Often the statistician's goal is to estimate g. As usual, the most important tool is a simple plot, similar to those in Figures 3.1 through 3.4.

Once we have an estimate, $\hat{g}$, for the regression function g (either by a scatterplot smoother or by some other technique) we can calculate $r_i \equiv y_i - \hat{g}(x_i)$. The $r_i$'s are estimates of the $\epsilon_i$'s and are called residuals. The $\epsilon_i$'s themselves are called errors. Because the $r_i$'s are estimates they are sometimes written with the "hat" notation:

$$ \hat{\epsilon}_i = r_i = \text{estimate of } \epsilon_i $$
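The definition of residuals above can be tried out directly. The following is a minimal sketch, not taken from the original text, that uses R's built-in faithful data as a stand-in for the (x, y) pairs. The names fit.default, fit.wiggly, ghat and r are chosen here for illustration. The argument f is lowess's tuning parameter; the value 0.2 is an arbitrary choice to show a wigglier curve.

```r
# Sketch: residuals from a lowess estimate of g, using the built-in faithful data.
x <- faithful$eruptions   # duration of current eruption
y <- faithful$waiting     # time to next eruption

# f is lowess's tuning parameter (the smoother span); its default is 2/3.
# Smaller f gives a wigglier curve, larger f a smoother one.
fit.default <- lowess(x, y)
fit.wiggly  <- lowess(x, y, f = 0.2)

# lowess() returns the curve at the sorted x values, so interpolate
# back to the original order to get g-hat at each x_i.
ghat <- approx(fit.default$x, fit.default$y, xout = x, ties = mean)$y

# Residuals r_i = y_i - g-hat(x_i)
r <- y - ghat

plot(x, y, xlab = "eruption duration", ylab = "waiting time")
lines(fit.default)           # default span
lines(fit.wiggly, lty = 2)   # smaller span, wigglier curve
```

Plotting r against x is one way to look for structure that the smooth curve has missed.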

Figure 3.4: Total number of New seedlings 1993-1997, by quadrat.

Residuals are used to evaluate and assess the fit of models for g, a topic which is beyond the scope of this book.

In regression we use one variable to explain or predict the other. It is customary in statistics to plot the predictor variable on the x-axis and the predicted variable on the y-axis. The predictor is also called the independent variable, the explanatory variable, the covariate, or simply x. The predicted variable is called the dependent variable, or simply y. (In Economics x and y are sometimes called the exogenous and endogenous variables, respectively.) Predicting or explaining y from x is not perfect; knowing x does not tell us y exactly. But knowing x does tell us something about y and allows us to make more accurate predictions than if we didn't know x.

Regression models are agnostic about causality. In fact, instead of using x to predict y, we could use y to predict x. So for each pair of variables there are two possible regressions: using x to predict y and using y to predict x. Sometimes neither variable causes the other. For example, consider a sample of cities and let x be the number of churches and y be the number of bars. A scatterplot of x and y will show a strong relationship between them. But the relationship is caused by the population of the cities. Large cities have large numbers of bars and churches and appear near the upper right of the scatterplot. Small cities have small numbers of bars and churches and appear near the lower left.

Scatterplot smoothers are a relatively unstructured way to estimate g. Their output follows the data points more or less closely as the tuning parameter allows $\hat{g}$ to be more or less wiggly. Sometimes an unstructured approach is appropriate, but not always. The rest of Chapter 3 presents more structured ways to estimate g.

3.2 Normal Linear Models

3.2.1 Introduction

In Section 1.3.4 we studied the Normal distribution, useful for continuous populations having a central tendency with roughly equally sized tails. In Section 3.2 we generalize to the case where there are many Normal distributions with different means which depend in a systematic way on another variable. We begin our study with an example in which there are three distinct distributions.

Example 3.3 (Hot Dogs, continued)
Figure 3.5 displays calorie data for three types of hot dogs. It appears that poultry hot dogs have, on average, slightly fewer calories than beef or meat hot dogs. How should we model these data?

Figure 3.5: Calorie content of hot dogs

Figure 3.5 was produced with

    stripchart ( hotdogs$Calories ~ hotdogs$Type, pch=1,
                 xlab="calories" )

There are 20 Beef, 17 Meat and 17 Poultry hot dogs in the sample. We think of them as samples from much larger populations. Figure 3.6 shows density estimates of calorie content for the three types. For each type of hot dog, the calorie contents cluster around a central value and fall off to either side without a particularly long left or right tail. So it is reasonable, at least as a first attempt, to model the three distributions as Normal. Since the three distributions have about the same amount of spread we model them as all having the same SD. We adopt the model

$$
\begin{aligned}
B_1, \dots, B_{20} &\sim \text{i.i.d. } N(\mu_B, \sigma) \\
M_1, \dots, M_{17} &\sim \text{i.i.d. } N(\mu_M, \sigma) \\
P_1, \dots, P_{17} &\sim \text{i.i.d. } N(\mu_P, \sigma),
\end{aligned} \tag{3.3}
$$

where the $B_i$'s, $M_i$'s and $P_i$'s are the calorie contents of the Beef, Meat and Poultry hot dogs respectively. Figure 3.6 suggests

$$ \mu_B \approx 150; \quad \mu_M \approx 160; \quad \mu_P \approx 120; \quad \sigma \approx 30. $$

An equivalent formulation is

$$
\begin{aligned}
B_1, \dots, B_{20} &\sim \text{i.i.d. } N(\mu, \sigma) \\
M_1, \dots, M_{17} &\sim \text{i.i.d. } N(\mu + \delta_M, \sigma) \\
P_1, \dots, P_{17} &\sim \text{i.i.d. } N(\mu + \delta_P, \sigma)
\end{aligned} \tag{3.4}
$$

Models 3.3 and 3.4 are mathematically equivalent. Each has three parameters for the population means and one for the SD. They describe exactly the same set of distributions and the parameters of either model can be written in terms of the other. The equivalence is shown in Table 3.1. For the purpose of further exposition we adopt Model 3.4.

We will see later how to carry out inferences regarding the parameters. For now we stop with the model.

Figure 3.6 was produced with the following snippet.

Figure 3.6: Density estimates of calorie contents of hot dogs

    # Note: the lower x-limit was lost in extraction; 50 is an assumed value.
    par ( mfrow=c(3,1) )
    plot ( density ( hotdogs$C[hotdogs$T=="Beef"], bw=20 ),
           xlim=c(50,250), yaxt="n", ylab="", xlab="calories",
           main="Beef" )
    plot ( density ( hotdogs$C[hotdogs$T=="Meat"], bw=20 ),
           xlim=c(50,250), yaxt="n", ylab="", xlab="calories",
           main="Meat" )
    plot ( density ( hotdogs$C[hotdogs$T=="Poultry"], bw=20 ),
           xlim=c(50,250), yaxt="n", ylab="", xlab="calories",
           main="Poultry" )

• hotdogs$C and hotdogs$T illustrate a convenient feature of R, that components of a structure can be abbreviated. Instead of typing hotdogs$Calories and hotdogs$Type we can use the abbreviations. The same thing applies to arguments of functions.

• density ( ..., bw=20 ) specifies the bandwidth of the density estimate. Larger bandwidth gives a smoother estimate; smaller bandwidth gives a more wiggly estimate. Try different bandwidths to see what they do.

The PlantGrowth data set in R provides another example. As R explains, the data previously appeared in Dobson [1983] and are

    "Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions."

The first several lines are

      weight group
    1   4.17  ctrl
    2   5.58  ctrl
    3   5.18  ctrl

Figure 3.7 shows the whole data set.

It appears that plants grown under different treatments tend to have different weights. In particular, plants grown under Treatment 1 appear to be smaller on average than plants grown under either the Control or Treatment 2. What statistical model should we adopt?

    Model 3.3          Model 3.4          Interpretation                          Approximate value
    -----------------  -----------------  --------------------------------------  -----------------
    $\mu_B$            $\mu$              mean calorie content of Beef hot dogs         150
    $\mu_M$            $\mu + \delta_M$   mean calorie content of Meat hot dogs         160
    $\mu_P$            $\mu + \delta_P$   mean calorie content of Poultry hot dogs      120
    $\mu_M - \mu_B$    $\delta_M$         mean calorie difference between Beef           10
                                          and Meat hot dogs
    $\mu_P - \mu_B$    $\delta_P$         mean calorie difference between Beef          -30
                                          and Poultry hot dogs
    $\sigma$           $\sigma$           SD of calorie content within a single          30
                                          type of hot dog

Table 3.1: Correspondence between Models 3.3 and 3.4

Figure 3.7: The PlantGrowth data

Figure 3.7 was produced with the following snippet.

    stripchart ( PlantGrowth$weight ~ PlantGrowth$group, pch=1,
                 xlab="weight" )

First, we think of the 10 plants grown under each condition as a sample from a much larger population of plants that could have been grown. Second, a look at the data suggests that the weights in each group are clustered around a central value, approximately symmetrically without an especially long tail in either direction. So we model the weights as having Normal distributions.

But we should allow for the possibility that the three populations have different means. (We do not address the possibility of different SD's here.) Let $\mu$ be the population mean of plants grown under the Control condition, $\delta_1$ and $\delta_2$ be the extra weight due to Treatment 1 and Treatment 2 respectively, and $\sigma$ be the SD. We adopt the model

$$
\begin{aligned}
W_{C,1}, \dots, W_{C,10} &\sim \text{i.i.d. } N(\mu, \sigma) \\
W_{T_1,1}, \dots, W_{T_1,10} &\sim \text{i.i.d. } N(\mu + \delta_1, \sigma) \\
W_{T_2,1}, \dots, W_{T_2,10} &\sim \text{i.i.d. } N(\mu + \delta_2, \sigma).
\end{aligned} \tag{3.5}
$$

There is a mathematical structure shared by 3.4, 3.5 and many other statistical models, and some common statistical notation to describe it. We'll use the hot dog data to illustrate.

Example 3.4 (Hot Dogs, continued)
Example 3.4 continues Example 2.2. First, there is the main variable of interest, often called the response variable and denoted Y. For the hot dog data Y is calorie content. (Another analysis could be made in which Y is sodium content.)

The distribution of Y is different under different circumstances. In this example, Y has a Normal distribution whose mean depends on the type of hot dog. In general, the distribution of Y will depend on some quantity of interest, called a covariate, regressor, or explanatory variable. Covariates are often called X.

The data consists of multiple data points, or cases. We write $Y_i$ and $X_i$ for the i'th case. It is usual to represent the data as a matrix with one row for each case. One column is for Y; the other columns are for explanatory variables. For the hot dog data the matrix is

    Type     Calories  Sodium

    Beef        186      495
    Beef        181      477
    ...
    Meat        140      428
    Meat        138      339
    Poultry     129      430
    Poultry     132      375
    ...

(For analysis of calories, the third column is irrelevant.)

Rewriting the data matrix in a slightly different form reveals some mathematical structure common to many models. There are 54 cases in the hot dog study. Let $(Y_1, \dots, Y_{54})$ be their calorie contents. For each $i$ from 1 to 54, define two new variables $X_{1,i}$ and $X_{2,i}$ by

$$
X_{1,i} = \begin{cases} 1 & \text{if the } i\text{'th hot dog is Meat,} \\ 0 & \text{otherwise} \end{cases}
\qquad \text{and} \qquad
X_{2,i} = \begin{cases} 1 & \text{if the } i\text{'th hot dog is Poultry,} \\ 0 & \text{otherwise.} \end{cases}
$$

$X_{1,i}$ and $X_{2,i}$ are indicator variables. Two indicator variables suffice because, for the i'th hot dog, if we know $X_{1,i}$ and $X_{2,i}$, then we know what type it is. (More generally, if there are $k$ populations, then $k - 1$ indicator variables suffice.) With these new variables, Model 3.4 can be rewritten as

$$ Y_i = \mu + \delta_M X_{1,i} + \delta_P X_{2,i} + \epsilon_i \tag{3.6} $$

for $i = 1, \dots, 54$, where

$$ \epsilon_1, \dots, \epsilon_{54} \sim \text{i.i.d. } N(0, \sigma). $$

Equation 3.6 is actually 54 separate equations, one for each case. We can write them succinctly using vector and matrix notation. Let

$$ Y = (Y_1, \dots, Y_{54})^t, \qquad B = (\mu, \delta_M, \delta_P)^t, \qquad E = (\epsilon_1, \dots, \epsilon_{54})^t, $$

(the transpose is there because, by convention, vectors are column vectors) and

$$
X = \begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
\vdots & \vdots & \vdots \\
1 & 0 & 0 \\
1 & 1 & 0 \\
\vdots & \vdots & \vdots \\
1 & 1 & 0 \\
1 & 0 & 1 \\
\vdots & \vdots & \vdots \\
1 & 0 & 1
\end{pmatrix}
$$

X is a $54 \times 3$ matrix. The first 20 lines are for the Beef hot dogs; the next 17 are for the Meat hot dogs; and the final 17 are for the Poultry hot dogs. Equation 3.6 can be written

$$ Y = XB + E \tag{3.7} $$

Equations similar to 3.6 and 3.7 are common to many statistical models. For the PlantGrowth data (page 214) let

$$
\begin{aligned}
Y_i &= \text{weight of } i\text{'th plant}, \\
X_{1,i} &= \begin{cases} 1 & \text{if } i\text{'th plant received treatment 1} \\ 0 & \text{otherwise} \end{cases} \\
X_{2,i} &= \begin{cases} 1 & \text{if } i\text{'th plant received treatment 2} \\ 0 & \text{otherwise} \end{cases} \\
Y &= (Y_1, \dots, Y_{30})^t \\
B &= (\mu, \delta_1, \delta_2)^t \\
E &= (\epsilon_1, \dots, \epsilon_{30})^t
\end{aligned}
$$

and

$$
X = \begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
\vdots & \vdots & \vdots \\
1 & 0 & 0 \\
1 & 1 & 0 \\
\vdots & \vdots & \vdots \\
1 & 1 & 0 \\
1 & 0 & 1 \\
\vdots & \vdots & \vdots \\
1 & 0 & 1
\end{pmatrix}
$$

Then analogously to 3.6 and 3.7 we can write

$$ Y_i = \mu + \delta_1 X_{1,i} + \delta_2 X_{2,i} + \epsilon_i \tag{3.8} $$

and

$$ Y = XB + E. \tag{3.9} $$

Notice that Equation 3.6 is nearly identical to Equation 3.8 and Equation 3.7 is identical to Equation 3.9. Their structure is common to many statistical models. Each $Y_i$ is written as the sum of two parts. The first part, $XB$ ($\mu + \delta_M X_{1,i} + \delta_P X_{2,i}$ for the hot dogs; $\mu + \delta_1 X_{1,i} + \delta_2 X_{2,i}$ for PlantGrowth), is called systematic, deterministic, or signal and represents the explainable differences between populations. The second part, $E$, or $\epsilon_i$, is random, or noise, and represents the differences between hot dogs or plants within a single population. The $\epsilon_i$'s are called errors. In statistics, the word "error" does not indicate a mistake; it simply means the noise part of a model, or the part left unexplained by covariates. Modelling a response variable as

    response = signal + noise

is a useful way to think and will recur throughout this book.

In 3.6 the signal $\mu + \delta_M X_{1,i} + \delta_P X_{2,i}$ is a linear function of $(\mu, \delta_M, \delta_P)$. In 3.8 the signal $\mu + \delta_1 X_{1,i} + \delta_2 X_{2,i}$ is a linear function of $(\mu, \delta_1, \delta_2)$. Models in which the signal is a linear function of the parameters are called linear models.

In our examples so far, X has been an indicator. For each of a finite number of X's there has been a corresponding population of Y's. As the next example illustrates, linear models can also arise when X is a continuous variable.
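The matrix X of indicator variables described above need not be typed by hand. As a small sketch (not part of the original text), R's model.matrix function can build it from the factor PlantGrowth$group. The column names it produces, (Intercept), grouptrt1 and grouptrt2, are R's labels for the columns that the text calls 1, $X_1$ and $X_2$.

```r
# Sketch: build the X matrix of indicator variables for PlantGrowth.
data(PlantGrowth)

# model.matrix() expands the factor 'group' into an intercept column
# plus one indicator column per non-baseline level (trt1, trt2).
X <- model.matrix(~ group, data = PlantGrowth)

dim(X)       # 30 rows (cases) and 3 columns
head(X, 3)   # control plants: rows of the form 1 0 0
colSums(X)   # 30 ones in the intercept column, 10 in each indicator
```

The control group is the baseline here because ctrl is the first level of the factor, which matches the role of $\mu$ in Model 3.5.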

Example 3.5 (Ice Cream Consumption)
This example comes from DASL, which says

    "Ice cream consumption was measured over 30 four-week periods from March 18, 1951 to July 11, 1953. The purpose of the study was to determine if ice cream consumption depends on the variables price, income, or temperature. The variables Lag-temp and Year have been added to the original data."

You can download the data from

    http://lib.stat.cmu.edu/DASL/Datafiles/IceCream.html

The first few lines look like this:

    date   IC   price  income  temp  Lag-temp  Year
    1     .386  .270     78     41      56      0
    2     .374  .282     79     56      63      0
    3     .393  .277     81     63      68      0

The variables are

    date      Time period (1-30) of the study (from 3/18/51 to 7/11/53)
    IC        Ice cream consumption in pints per capita
    Price     Price of ice cream per pint in dollars
    Income    Weekly family income in dollars
    Temp      Mean temperature in degrees F
    Lag-temp  Temp variable lagged by one time period
    Year      Year within the study (0 = 1951, 1 = 1952, 2 = 1953)

Figure 3.8 is a plot of consumption versus temperature. It looks as though an equation of the form

$$ \text{consumption} = \beta_0 + \beta_1\, \text{temperature} + \text{error} \tag{3.10} $$

would describe the data reasonably well. This is a linear model, not because consumption is a linear function of temperature, but because it is a linear function of $(\beta_0, \beta_1)$. To write it in matrix form, let

$$
\begin{aligned}
Y &= (\mathrm{IC}_1, \dots, \mathrm{IC}_{30})^t \\
B &= (\beta_0, \beta_1)^t \\
E &= (\epsilon_1, \dots, \epsilon_{30})^t
\end{aligned}
$$

and

$$
X = \begin{pmatrix}
1 & \mathrm{temp}_1 \\
1 & \mathrm{temp}_2 \\
\vdots & \vdots \\
1 & \mathrm{temp}_{30}
\end{pmatrix}
$$

The model is

$$ Y = XB + E. \tag{3.11} $$

Equation 3.11 is a linear model, identical to Equations 3.7 and 3.9.

Equation 3.7 (equivalently, 3.9 or 3.11) is the basic form of all linear models. Linear models are extremely useful because they can be applied to so many kinds of data sets. Section 3.2.2 investigates some of their theoretical properties and R's functions for fitting them to data.

3.2.2 Inference for Linear Models

Section 3.2.1 showed some graphical displays of data that were eventually described by linear models. Section 3.2.2 treats more formal inference for linear models. We begin by deriving the likelihood function.

Linear models are described by Equation 3.7 (equivalently, 3.9 or 3.11) which we repeat here for convenience:

$$ Y = XB + E. \tag{3.12} $$

In general there is an arbitrary number of cases, say $n$, and an arbitrary number of covariates, say $p$. Equation 3.12 is shorthand for the collection of univariate equations

$$ Y_i = \beta_0 + \beta_1 X_{1,i} + \cdots + \beta_p X_{p,i} + \epsilon_i \tag{3.13} $$

Figure 3.8: Ice cream consumption (pints per capita) versus mean temperature (°F)

or equivalently,

$$ Y_i \sim N(\mu_i, \sigma) $$

for $i = 1, \dots, n$ where $\mu_i = \beta_0 + \sum_j \beta_j X_{j,i}$ and the $\epsilon_i$'s are i.i.d. $N(0, \sigma)$. There are $p + 2$ parameters: $(\beta_0, \dots, \beta_p, \sigma)$. The likelihood function is

$$
\begin{aligned}
\ell(\beta_0, \dots, \beta_p, \sigma)
&= \prod_{i=1}^n p(y_i \mid \beta_0, \dots, \beta_p, \sigma) \\
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2} \\
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{y_i - (\beta_0 + \sum_j \beta_j X_{j,i})}{\sigma}\right)^2} \\
&= \left(2\pi\sigma^2\right)^{-\frac{n}{2}} e^{-\frac{1}{2\sigma^2} \sum_i \left(y_i - (\beta_0 + \sum_j \beta_j X_{j,i})\right)^2}
\end{aligned} \tag{3.14}
$$

Likelihood 3.14 is a function of the $p + 2$ parameters. To find the m.l.e.'s we could differentiate 3.14 with respect to each parameter in turn, set the derivatives equal to 0, and solve. But it is easier to take the log of 3.14 first, then differentiate and solve.

$$
\log \ell(\beta_0, \dots, \beta_p, \sigma) = C - n \log \sigma - \frac{1}{2\sigma^2} \sum_i \Bigl(y_i - \bigl(\beta_0 + \sum_j \beta_j X_{j,i}\bigr)\Bigr)^2
$$

for some irrelevant constant $C$, so we get the system of equations

$$
\begin{aligned}
\frac{1}{\hat\sigma^2} \sum_i \Bigl(y_i - \bigl(\hat\beta_0 + \sum_j \hat\beta_j X_{j,i}\bigr)\Bigr) &= 0 \\
\frac{1}{\hat\sigma^2} \sum_i \Bigl(y_i - \bigl(\hat\beta_0 + \sum_j \hat\beta_j X_{j,i}\bigr)\Bigr) X_{1,i} &= 0 \\
&\;\;\vdots \\
\frac{1}{\hat\sigma^2} \sum_i \Bigl(y_i - \bigl(\hat\beta_0 + \sum_j \hat\beta_j X_{j,i}\bigr)\Bigr) X_{p,i} &= 0 \\
-\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3} \sum_i \Bigl(y_i - \bigl(\hat\beta_0 + \sum_j \hat\beta_j X_{j,i}\bigr)\Bigr)^2 &= 0
\end{aligned} \tag{3.15}
$$

Note the hat notation to indicate estimates. The m.l.e.'s $(\hat\beta_0, \dots, \hat\beta_p, \hat\sigma)$ are the values of the parameters that make the derivatives equal to 0 and therefore satisfy Equations 3.15. The first $p + 1$ of these equations can be multiplied by $\hat\sigma^2$, yielding $p + 1$ linear equations in the $p + 1$ unknown $\beta$'s. Because they're linear, they can be solved by linear algebra. The solution is

$$ \hat{B} = (X^t X)^{-1} X^t Y, $$

using the notation of Equation 3.12.

For each $i \in \{1, \dots, n\}$, let

$$ \hat{y}_i = \hat\beta_0 + X_{1,i} \hat\beta_1 + \cdots + X_{p,i} \hat\beta_p. $$

The $\hat{y}_i$'s are called fitted values. The residuals are

$$ r_i = y_i - \hat{y}_i = y_i - \bigl(\hat\beta_0 + X_{1,i} \hat\beta_1 + \cdots + X_{p,i} \hat\beta_p\bigr) $$

and are estimates of the errors $\epsilon_i$. Finally, referring to the last line of Equation 3.15, the m.l.e. $\hat\sigma$ is found from

$$
\begin{aligned}
0 &= -\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3} \sum_i \Bigl(y_i - \bigl(\hat\beta_0 + \sum_j \hat\beta_j X_{j,i}\bigr)\Bigr)^2 \\
  &= -\frac{n}{\hat\sigma} + \frac{1}{\hat\sigma^3} \sum_i r_i^2
\end{aligned}
$$

so

$$ \hat\sigma^2 = \frac{1}{n} \sum_i r_i^2 \qquad \text{and} \qquad \hat\sigma = \left(\frac{\sum_i r_i^2}{n}\right)^{\frac{1}{2}} \tag{3.16} $$

In addition to the m.l.e.'s we often want to look at the likelihood function to judge, for example, how accurately each $\beta$ can be estimated. The likelihood function for a single $\beta_i$ comes from the Central Limit Theorem. We will not work out the math here but, fortunately, R will do all the calculations for us. We illustrate with the hot dog data.
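Before turning to the hot dog data, the closed-form solution above can be checked numerically. The sketch below is not from the original text. It uses the built-in mtcars data with mpg as the response and wt as the single covariate (so p = 1), computes $\hat{B}$ and the m.l.e. $\hat\sigma$ from Equation 3.16 by hand, and compares them with the output of R's lm function. The variable names B.hat, sigma.mle and sigma.lm are chosen here for illustration.

```r
# Sketch: m.l.e.'s for a Normal linear model via linear algebra,
# checked against lm(). Uses the built-in mtcars data, mpg vs. wt.
y <- mtcars$mpg
X <- cbind(1, mtcars$wt)   # column of 1's plus one covariate
n <- nrow(X)
p <- ncol(X) - 1           # number of covariates

# B-hat = (X^t X)^{-1} X^t Y; solve(A, b) avoids forming the inverse explicitly.
B.hat <- solve(t(X) %*% X, t(X) %*% y)

y.hat <- X %*% B.hat       # fitted values
r     <- y - y.hat         # residuals

sigma.mle <- sqrt(sum(r^2) / n)             # Equation 3.16, denominator n
sigma.lm  <- sqrt(sum(r^2) / (n - p - 1))   # denominator n - p - 1

fit <- lm(mpg ~ wt, data = mtcars)
cbind(by.hand = drop(B.hat), lm = coef(fit))
c(mle = sigma.mle, n.minus.p.minus.1 = sigma.lm, lm = summary(fit)$sigma)
```

The coefficients agree with lm. The value that summary reports as the residual standard error matches the version with denominator n - p - 1 rather than the m.l.e., and the two differ only slightly when n is much larger than p.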

Example 3.6 (Hot Dogs, continued)
Estimating the parameters of a model is called fitting a model to data. R has built-in commands for fitting models. The following snippet fits Model 3.7 to the hot dog data. The syntax is similar for many model fitting commands in R, so it is worth spending some time to understand it.

    hotdogs.fit <- lm ( hotdogs$Calories ~ hotdogs$Type )

• lm stands for linear model.

• ~ stands for "is a function of". It is used in many of R's modelling commands. y ~ x is called a formula and means that y is modelled as a function of x. In the case at hand, Calories is modelled as a function of Type.

• lm specifies the type of model.

• R automatically creates the X matrix in Equation 3.12 and estimates the parameters.

• The result of fitting the model is stored in a new object called hotdogs.fit. Of course we could have called it anything we like.

• lm can have an argument data, which specifies a data frame. So instead of

    hotdogs.fit <- lm ( hotdogs$Calories ~ hotdogs$Type )

  we could have written

    hotdogs.fit <- lm ( Calories ~ Type, data=hotdogs )

  You may want to try this to see how it works.

To see hotdogs.fit, use R's summary function. Its use and the resulting output are shown in the following snippet.

    > summary ( hotdogs.fit )

    Call:
    lm(formula = hotdogs$Calories ~ hotdogs$Type)

    Residuals:
        Min      1Q  Median      3Q     Max
    -51.706 -18.492  -5.278  22.500  36.294

    Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
    (Intercept)          156.850      5.246  29.901  < 2e-16 ***
    hotdogs$TypeMeat       1.856      7.739   0.240    0.811
    hotdogs$TypePoultry  -38.085      7.739  -4.921  9.4e-06 ***
    ---
    Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

    Residual standard error: 23.46 on 51 degrees of freedom
    Multiple R-Squared: 0.3866,     Adjusted R-squared: 0.3626
    F-statistic: 16.07 on 2 and 51 DF,  p-value: 3.862e-06

The most important part of the output is the table labelled Coefficients:. There is one row of the table for each coefficient. Their names are on the left. In this table the names are (Intercept), hotdogs$TypeMeat, and hotdogs$TypePoultry. The first column is labelled Estimate. Those are the m.l.e.'s. R has fit the model

$$ Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \epsilon_i $$

where $X_1$ and $X_2$ are indicator variables for the type of hot dog. The model implies

$$
\begin{aligned}
Y_i &= \beta_0 + \epsilon_i && \text{for beef hot dogs} \\
Y_i &= \beta_0 + \beta_1 + \epsilon_i && \text{for meat hot dogs} \\
Y_i &= \beta_0 + \beta_2 + \epsilon_i && \text{for poultry hot dogs}
\end{aligned}
$$

Therefore the names mean

    $\beta_0$ = (Intercept) = mean calorie content of beef hot dogs
    $\beta_1$ = hotdogs$TypeMeat = mean difference between beef and meat hot dogs
    $\beta_2$ = hotdogs$TypePoultry = mean difference between beef and poultry hot dogs
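The same reading of the coefficient names can be checked on a data set that ships with R. The following sketch is an illustration added here, not part of the original example. It fits the analogous model to PlantGrowth and compares the coefficients with the group means. The names fit, group.means and diffs are chosen for illustration.

```r
# Sketch: with a single factor as covariate, the intercept is the mean of
# the baseline group and the other coefficients are differences from it.
fit <- lm(weight ~ group, data = PlantGrowth)

# Sample mean of weight within each group (ctrl, trt1, trt2)
group.means <- sapply(split(PlantGrowth$weight, PlantGrowth$group), mean)

# Differences between each treatment mean and the control mean
diffs <- group.means[c("trt1", "trt2")] - group.means[["ctrl"]]

coef(fit)     # (Intercept), grouptrt1, grouptrt2
group.means   # compare the ctrl entry with (Intercept)
diffs         # compare with grouptrt1 and grouptrt2
```

Here ctrl plays the role that Beef plays in the hot dog model: it is the first level of the factor, so R treats it as the baseline.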

From the Coefficients table the estimates are

$$ \hat\beta_0 = 156.850 \qquad \hat\beta_1 = 1.856 \qquad \hat\beta_2 = -38.085 $$

The next column of the table is labelled Std. Error. It contains the SD's of the estimates. In this case, $\hat\beta_0$ has an SD of about 5.2; $\hat\beta_1$ has an SD of about 7.7, and $\hat\beta_2$ also has an SD of about 7.7. The Central Limit Theorem says that approximately, in large samples

$$
\begin{aligned}
\hat\beta_0 &\sim N(\beta_0, \sigma_{\beta_0}) \\
\hat\beta_1 &\sim N(\beta_1, \sigma_{\beta_1}) \\
\hat\beta_2 &\sim N(\beta_2, \sigma_{\beta_2})
\end{aligned}
$$

The SD's in the table are estimates of the SD's in the Central Limit Theorem.

Figure 3.9 plots the likelihood functions. The interpretation is that $\beta_0$ is likely somewhere around 157, plus or minus about 10 or so; $\beta_1$ is somewhere around 2, plus or minus about 15 or so; and $\beta_2$ is somewhere around -38, plus or minus about 15 or so. (Compare to Table 3.1.) In particular, there is no strong evidence that Meat hot dogs have, on average, more or fewer calories than Beef hot dogs; but there is quite strong evidence that Poultry hot dogs have considerably fewer.

Figure 3.9 was produced with the following snippet.

    m <- c ( 156.85, 1.856, -38.085 )    # estimates
    s <- c ( 5.246, 7.739, 7.739 )       # their SD's

    par ( mfrow=c(2,2) )

    x <- seq ( m[1]-3*s[1], m[1]+3*s[1], length=40 )
    plot ( x, dnorm(x,m[1],s[1]), type="l",
           xlab=expression(mu), ylab="likelihood", yaxt="n" )

    x <- seq ( m[2]-3*s[2], m[2]+3*s[2], length=40 )
    plot ( x, dnorm(x,m[2],s[2]), type="l",
           xlab=expression(delta[M]),
           ylab="likelihood", yaxt="n" )

    x <- seq ( m[3]-3*s[3], m[3]+3*s[3], length=40 )
    plot ( x, dnorm(x,m[3],s[3]), type="l",
           xlab=expression(delta[P]),
           ylab="likelihood", yaxt="n" )

Figure 3.9: Likelihood functions for $(\mu, \delta_M, \delta_P)$ in the Hot Dog example.

The summary also gives an estimate of $\sigma$. The estimate is labelled Residual standard error. In this case, $\hat\sigma \approx 23.46$. [1] So our model says that for each type of hot dog, the calorie contents have approximately a Normal distribution with SD about 23 or so. Compare to Figure 3.5 to see whether the 23.46 makes sense.

Regression is sometimes used in an exploratory setting, when scientists want to find out which variables are related to which other variables. Often there is a response variable Y (imagine, for example, performance in school) and they want to know which other variables affect Y (imagine, for example, poverty, amount of television watching, computer in the home, parental involvement, etc.). Example 3.7 illustrates the process.

Example 3.7 (mtcars)
This example uses linear regression to explore the R data set mtcars (see Figure 3.1, panel (c)) more thoroughly with the goal of modelling mpg (miles per gallon) as a function of the other variables. As usual, type data(mtcars) to load the data into R and help(mtcars) for an explanation. As R explains:

    "The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models)."

In an exploratory exercise such as this, it often helps to begin by looking at the data. Accordingly, Figure 3.10 is a pairs plot of the data, using just the continuous variables.

Figure 3.10 was produced by

    pairs ( mtcars[,c(1,3:7)] )

[1] R, like most statistical software, does not report the m.l.e. but reports instead $\hat\sigma \equiv \left(\sum r_i^2 / (n - p - 1)\right)^{1/2}$. Compare to Equation 3.16 for the m.l.e., in which the denominator is $n$. The situation is similar to the sample SD on page 98. When $n \gg p$ there is little difference between the two estimates.

Figure 3.10: pairs plot of the mtcars data. Type help(mtcars) in R for an explanation.

Clearly, mpg is related to several of the other variables. Weight is an obvious and intuitive example. The figure suggests that the linear model

$$ \text{mpg} = \beta_0 + \beta_1\, \text{wt} + \epsilon \tag{3.17} $$

is a good start to modelling the data. Figure 3.11 (a) is a plot of mpg vs. weight plus the fitted line. The estimated coefficients turn out to be $\hat\beta_0 \approx 37.3$ and $\hat\beta_1 \approx -5.34$. The interpretation is that mpg decreases by about 5.34 for every 1000 pounds of weight. Note: this does not mean that if you put a 1000 pound weight in your car your mileage mpg will decrease by 5.34. It means that if car A weighs about 1000 pounds less than car B, then we expect car A to get an extra 5.34 miles per gallon. But there are likely many differences between A and B besides weight. The 5.34 accounts for all of those differences, on average.

We could just as easily have begun by fitting mpg as a function of horsepower with the model

$$ \text{mpg} = \gamma_0 + \gamma_1\, \text{hp} + \epsilon \tag{3.18} $$

We use $\gamma$'s to distinguish the coefficients in Equation 3.18 from those in Equation 3.17. The m.l.e.'s turn out to be $\hat\gamma_0 \approx 30.1$ and $\hat\gamma_1 \approx -0.069$. Figure 3.11 (b) shows the corresponding scatterplot and fitted line. Which model do we prefer? Choosing among different possible models is a major area of statistical practice with a large literature that can be highly technical. In this book we show just a few considerations.

One way to judge models is through residual plots, which are plots of residuals versus either X variables or fitted values. If models are adequate, then residual plots should show no obvious patterns. Patterns in residual plots are clues to model inadequacy and how to improve models. Figure 3.11 (c) and (d) are residual plots for mpg.fit1 (mpg vs. wt) and mpg.fit2 (mpg vs. hp). There are no obvious patterns in panel (c). In panel (d) there is a suggestion of curvature. For fitted values between about 15 and 23, residuals tend to be low, but for fitted values less than about 15 or greater than about 23, residuals tend to be high. (The same pattern might have been noted in panel (b).) This suggests that mpg might be better fit as a nonlinear function of hp. We do not pursue that suggestion further at the moment, merely noting that there may be a minor flaw in mpg.fit2 and we therefore slightly prefer mpg.fit1.

Another thing to note from panels (c) and (d) is the overall size of the residuals. In (c), they run from about -4 to about +6, while in (d) they run from about -6 to about +6. That is, the residuals from mpg.fit2 tend to be slightly larger in absolute value than the residuals from mpg.fit1, suggesting that wt predicts mpg slightly better than does hp. That impression can be confirmed by getting the summary of both fits and checking $\hat\sigma$. From mpg.fit1, $\hat\sigma \approx 3.046$ while from mpg.fit2, $\hat\sigma \approx 3.863$. I.e., from wt we can predict mpg to within about 6 or so (two SD's) while from hp we can predict mpg only to within about 7.7 or so. For this reason too, we slightly prefer mpg.fit1 to mpg.fit2.

What about the possibility of using both weight and horsepower to predict mpg? Consider

    mpg.fit3 <- lm ( mpg ~ wt + hp, data=mtcars )

• The formula y ~ x1 + x2 means fit y as a function of both x1 and x2. In our example that means

$$ \text{mpg} = \delta_0 + \delta_1\, \text{wt} + \delta_2\, \text{hp} + \epsilon \tag{3.19} $$

A residual plot from model 3.19 is shown in Figure 3.11 (e). The m.l.e.'s are $\hat\delta_0 \approx 37.2$, $\hat\delta_1 \approx -3.88$, $\hat\delta_2 \approx -0.03$, and $\hat\sigma \approx 2.6$. Since the residual plot looks curved, Model 3.17 has residuals about as small as Model 3.19, and Model 3.17 is more parsimonious than Model 3.19, we slightly prefer Model 3.17.

Figure 3.11 (a) was produced with

    plot ( mtcars$wt, mtcars$mpg, xlab="weight", ylab="mpg" )
    mpg.fit1 <- lm ( mpg ~ wt, data=mtcars )
    abline ( coef(mpg.fit1) )

Figure 3.11, panels (c) and (d) were produced with

    # mpg.fit2 is not defined in the original snippets; it is assumed
    # here to be the fit of Model 3.18 (mpg vs. hp)
    mpg.fit2 <- lm ( mpg ~ hp, data=mtcars )

    # panel c
    plot ( fitted(mpg.fit1), resid(mpg.fit1), main="(c)",
           xlab="fitted values from fit1", ylab="resid" )

    # panel d
    plot ( fitted(mpg.fit2), resid(mpg.fit2),
           xlab="fitted values from fit2", ylab="resid",
           main="(d)" )
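The comparison of the three models can be gathered into one short script. This is a sketch rather than the author's code: the text gives the calls for mpg.fit1 and mpg.fit3 only, so the call for mpg.fit2 is inferred from the description "mpg vs. hp", and the helper names fits and sigma.hat are introduced here for illustration.

```r
# Sketch: fit Models 3.17, 3.18 and 3.19 and compare their estimates of sigma.
mpg.fit1 <- lm(mpg ~ wt,      data = mtcars)   # Model 3.17
mpg.fit2 <- lm(mpg ~ hp,      data = mtcars)   # Model 3.18 (call assumed)
mpg.fit3 <- lm(mpg ~ wt + hp, data = mtcars)   # Model 3.19

fits <- list(fit1 = mpg.fit1, fit2 = mpg.fit2, fit3 = mpg.fit3)

# Residual standard error (R's estimate of sigma) for each fit
sigma.hat <- sapply(fits, function(f) summary(f)$sigma)
round(sigma.hat, 3)

# Estimated coefficients for each fit
lapply(fits, coef)
```

A residual plot for the third model, like panel (e), can be drawn by passing fitted(mpg.fit3) and resid(mpg.fit3) to plot in the same way as for panels (c) and (d).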

Figure 3.11: mtcars. (a): mpg vs. wt; (b): mpg vs. hp; (c): residual plot from mpg ~ wt; (d): residual plot from mpg ~ hp; (e): residual plot from mpg ~ wt + hp

In Example 3.7 we fit three models for mpg, repeated here with their original equation numbers.

$$ \text{mpg} = \beta_0 + \beta_1\, \text{wt} + \epsilon \tag{3.17} $$
$$ \text{mpg} = \gamma_0 + \gamma_1\, \text{hp} + \epsilon \tag{3.18} $$
$$ \text{mpg} = \delta_0 + \delta_1\, \text{wt} + \delta_2\, \text{hp} + \epsilon \tag{3.19} $$

What is the connection between, say, $\beta_1$ and $\delta_1$, or between $\gamma_1$ and $\delta_2$? $\beta_1$ is the average mpg difference between two cars whose weights differ by 1000 pounds. Since heavier cars tend to be different from lighter cars in many ways, not just in weight, $\beta_1$ captures the net effect on mpg of all those differences. On the other hand, $\delta_1$ is the average mpg difference between two cars of identical horsepower but whose weights differ by 1000 pounds. Figure 3.12 shows the likelihood functions of these four parameters. The evidence suggests that $\beta_1$ is probably in the range of about -7 to about -4, while $\delta_1$ is in the range of about -6 to -2. It's possible that $\beta_1 \approx \delta_1$. On the other hand, $\gamma_1$ is probably in the interval $(-.1, -.04)$ while $\delta_2$ is probably in the interval $(-.05, 0)$. It's quite likely that $\gamma_1 \not\approx \delta_2$. Scientists sometimes ask the question "What is the effect of variable X on variable Y?" That question does not have an unambiguous answer; the answer depends on which other variables are accounted for and which are not.

Figure 3.12 was produced with

    par ( mfrow=c(2,2) )

    x <- seq ( -8, -1.5, len=60 )
    plot ( x, dnorm(x,-5.3445,.5591), type="l",
           xlab=expression(beta[1]), ylab="", yaxt="n" )

    x <- seq ( -.1, 0, len=60 )
    plot ( x, dnorm(x,-.06823,.01012), type="l",
           xlab=expression(gamma[1]), ylab="", yaxt="n" )

    x <- seq ( -8, -1.5, len=60 )
    plot ( x, dnorm(x,-3.87783,.63273), type="l",
           xlab=expression(delta[1]), ylab="", yaxt="n" )

    x <- seq ( -.1, 0, len=60 )
    plot ( x, dnorm(x,-.03177,.00903), type="l",
           xlab=expression(delta[2]), ylab="", yaxt="n" )
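The estimates and SD's typed into the snippet above can also be read off the fitted models instead of being copied by hand. The sketch below is an addition for illustration. It assumes the fits are the lm calls from Example 3.7, and the "estimate plus or minus two SD's" ranges it prints are a rough summary in the spirit of the text, not a formal interval procedure. The names est1, est3 and rough.range are introduced here.

```r
# Sketch: pull the weight coefficient and its SD from two of the fits.
fit1 <- lm(mpg ~ wt,      data = mtcars)   # Model 3.17
fit3 <- lm(mpg ~ wt + hp, data = mtcars)   # Model 3.19

# Each row of the coefficient table holds Estimate, Std. Error, t value, Pr(>|t|)
est1 <- summary(fit1)$coefficients["wt", c("Estimate", "Std. Error")]
est3 <- summary(fit3)$coefficients["wt", c("Estimate", "Std. Error")]

# Rough range: estimate plus or minus two SD's
rough.range <- function(e) e[["Estimate"]] + c(-2, 2) * e[["Std. Error"]]

rbind(beta1 = rough.range(est1), delta1 = rough.range(est3))
```

The two ranges overlap, which is consistent with the remark above that $\beta_1$ and $\delta_1$ may be about equal.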

Figure 3.12: likelihood functions for $\beta_1$, $\gamma_1$, $\delta_1$ and $\delta_2$ in the mtcars example.

3.3.GENERALIZEDLINEARMODELS 236 3.3GeneralizedLinearModels 3.3.1LogisticRegression LookagainatpanelcinFigure3.1onpage203.Thedependentvariableisbinary, asopposedtothecontinuousdependentvariablesinpanelsa,bandd.Ina, banddwemodelled Y j X ashavingaNormaldistribution;regressionwasa modelfor E [ Y j X ] ,themeanofthatNormaldistributionasafunctionof X .Inc Y j X hasaBinomialdistribution.Westillusethetermregressionforamodelof E [ Y j X ] .When Y isbinary,regressionisamodelfortheprobabilityofsuccess as afunctionof X Figure3.13showstwomorescatterplotswhere Y isabinaryvariable.Thedata aredescribedinthenexttwoexamples. Example3.8 FACE,continued RefertoExamples1.12and2.11abouttheFACEexperimenttoassesstheeectsof excessCO 2 onthegrowthoftheforest.Todescribethesizeoftrees,ecologistssometimes useDiameteratBreastHeight,orDBH.DBHwasrecordedeveryyearforeachloblolly pinetreeintheFACEexperiment.OnepotentialeectofelevatedCO 2 isforthetreesto reachsexualmaturityandhencebeabletoreproduceearlierthanotherwise.Iftheydo matureearlier,ecologistswouldliketoknowwhetherthat'sdueonlytotheirincreased size,orwhethertreeswillreachmaturitynotjustatyoungerages,butalsoatsmaller sizes.Sexuallymaturetreescanproducepineconesbutimmaturetreescannot.Soto investigatesexualmaturity,agraduatestudentcountedthenumberofpineconeson eachtree.Foreachtreelet X beitsDBHand Y beeither1or0accordingtowhether thetreehaspinecones. Figure3.13aisaplotof Y versus X forallthetreesinRing1.Itdoesappearthat largertreesaremorelikelytohavepinecones. Example3.9 O-rings OnJanuary28,1986Americawasshockedbythedestructionofthespaceshuttle Challenger,andthedeathofitssevencrewmembers.Sobeginsthewebsite http: //www.fas.org/spp/51L.html oftheFederationofAmericanScientist's SpacePolicy Project. 
Upuntil1986thespaceshuttleorbiterwasliftedintospacebyapairofboosterrockets,oneoneachsideoftheshuttle,thatwerecomprisedoffoursectionsstackedvertically ontopofeachother.ThejointsbetweenthesectionsweresealedbyO-rings.OnJanuary 28,1986thetemperatureatlaunchtimewassocoldthattheO-ringsbecamebrittleand failedtosealthejoints,allowinghotexhaustgastocomeintocontactwithunburned


fuel. The result was the Challenger disaster. An investigation ensued. The web site http://science.ksc.nasa.gov/shuttle/missions/51-l/docs contains links to

1. a description of the event,
2. a report (Kerwin) on the initial attempt to determine the cause,
3. a report (rogers-commission) of the presidential investigative commission that finally did determine the cause, and
4. a transcript of the operational recorder voice tape.

One of the issues was whether NASA could or should have foreseen that cold weather might diminish performance of the O-rings.

After launch the booster rockets detach from the orbiter and fall into the ocean where they are recovered by NASA, taken apart and analyzed. As part of the analysis NASA records whether any of the O-rings were damaged by contact with hot exhaust gas. If the probability of damage is greater in cold weather then, in principle, NASA might have foreseen the possibility of the accident, which occurred during a launch much colder than any previous launch.

Figure 3.13(b) plots $Y$ = presence of damage against $X$ = temperature for the launches prior to the Challenger accident. The figure does suggest that colder launches are more likely to have damaged O-rings. What is wanted is a model for probability of damage as a function of temperature, and a prediction for probability of damage at 37°F, the temperature of the Challenger launch.

Fitting straight lines to Figure 3.13 doesn't make sense. In panel (a) what we need is a curve such that

1. $E[Y \mid X] = P[Y = 1 \mid X]$ is close to 0 when $X$ is smaller than about 10 or 12 cm., and
2. $E[Y \mid X] = P[Y = 1 \mid X]$ is close to 1 when $X$ is larger than about 25 or 30 cm.

In panel (b) we need a curve that goes in the opposite direction.

The most commonly adopted model in such situations is

$$E[Y \mid X] = P[Y = 1 \mid X] = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \qquad (3.20)$$

Figure 3.14 shows the same data as Figure 3.13 with some curves added according to Equation 3.20. The values of $\beta_0$ and $\beta_1$ are in Table 3.2.
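To see how Equation 3.20 turns a linear predictor into a probability, here is a minimal sketch in R. The function name `logistic_curve` and the grid of DBH values are illustrative choices, not part of the original text; the coefficients are the dashed-line values for panel (a). `plogis` is base R's logistic distribution function, which computes $e^{\eta}/(1+e^{\eta})$.

```r
# Equation 3.20: P[Y = 1 | X = x] = exp(b0 + b1*x) / (1 + exp(b0 + b1*x)).
# logistic_curve is a hypothetical helper name used only for this sketch.
logistic_curve <- function(x, b0, b1) plogis(b0 + b1 * x)

x <- seq(4, 25, length = 40)                  # illustrative grid of DBH values (cm)
p <- logistic_curve(x, b0 = -7.5, b1 = 0.36)  # dashed curve, panel (a)

# The curve stays between 0 and 1 and rises with x because b1 > 0.
plot(x, p, type = "l", ylim = c(0, 1),
     xlab = "DBH", ylab = "P[Y = 1 | X]")
```

Changing the sign of `b1` flips the direction of the curve, which is what panel (b) requires.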


Figure 3.13: (a): pine cone presence/absence vs. dbh. (b): O-ring damage vs. launch temperature


Figure 3.14: (a): pine cone presence/absence vs. dbh. (b): O-ring damage vs. launch temperature, with some logistic regression curves


    panel   line      beta_0   beta_1
    a       solid     -8        .45
    a       dashed    -7.5      .36
    a       dotted    -5        .45
    b       solid     20       -.3
    b       dashed    15       -.23
    b       dotted    18       -.3

Table 3.2: $\beta$'s for Figure 3.14

Figure 3.14 was produced by the following snippet.

    par(mfrow = c(2, 1))
    # panel a: pine cones vs. DBH
    # (mature is assumed here to be an indicator defined for all trees)
    plot(cones$dbh[ring1], mature[ring1], xlab = "DBH",
         ylab = "pine cones present", main = "(a)")
    x <- seq(4, 25, length = 40)
    b0 <- c(-8, -7.5, -5)
    b1 <- c(.45, .36, .45)
    for (i in 1:3)
      lines(x, exp(b0[i] + b1[i]*x) / (1 + exp(b0[i] + b1[i]*x)),
            lty = i)

    # panel b: O-ring damage vs. launch temperature
    plot(orings$temp, orings$damage > 0, xlab = "temperature",
         ylab = "damage present", main = "(b)")
    x <- seq(50, 82, length = 40)
    b0 <- c(20, 15, 18)
    b1 <- c(-.3, -.23, -.3)
    for (i in 1:3)
      lines(x, exp(b0[i] + b1[i]*x) / (1 + exp(b0[i] + b1[i]*x)),
            lty = i)

Model 3.20 is known as logistic regression. Let the $i$'th observation have covariate $x_i$ and probability of success $\theta_i = E[Y_i \mid x_i]$. Define

$$\phi_i \equiv \log\left(\frac{\theta_i}{1 - \theta_i}\right).$$


$\phi_i$ is called the logit of $\theta_i$. The inverse transformation is

$$\theta_i = \frac{e^{\phi_i}}{1 + e^{\phi_i}}.$$

The logistic regression model is

$$\phi_i = \beta_0 + \beta_1 x_i.$$

This is called a generalized linear model or glm because it is a linear model for $\phi$, a transformation of $E(Y \mid x)$, rather than for $E(Y \mid x)$ directly. The quantity $\beta_0 + \beta_1 x$ is called the linear predictor. If $\beta_1 > 0$, then as $x \to +\infty$, $\theta \to 1$ and as $x \to -\infty$, $\theta \to 0$. If $\beta_1 < 0$ the situation is reversed. $\beta_0$ is like an intercept; it controls how far to the left or right the curve is. $\beta_1$ is like a slope; it controls how quickly the curve moves between its two asymptotes.

Logistic regression and, indeed, all generalized linear models differ from linear regression in two ways: the regression function is nonlinear and the distribution of $Y \mid x$ is not Normal. These differences imply that the methods we used to analyze linear models are not correct for generalized linear models. We need to derive the likelihood function and find new calculational algorithms.

The likelihood function is derived from first principles.

$$
\begin{aligned}
p(Y_1, \ldots, Y_n \mid x_1, \ldots, x_n, \beta_0, \beta_1)
 &= \prod_i p(Y_i \mid x_i, \beta_0, \beta_1) \\
 &= \prod_i \theta_i^{y_i} (1 - \theta_i)^{1 - y_i} \\
 &= \prod_{i:\, y_i = 1} \theta_i \prod_{i:\, y_i = 0} (1 - \theta_i) \\
 &= \prod_{i:\, y_i = 1} \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}
    \prod_{i:\, y_i = 0} \frac{1}{1 + e^{\beta_0 + \beta_1 x_i}}
\end{aligned}
$$

This is a rather complicated function of the two variables $(\beta_0, \beta_1)$. However, a Central Limit Theorem applies to give a likelihood function for $\beta_0$ and $\beta_1$ that is accurate when $n$ is reasonably large. The theory is beyond the scope of this book, but R will do the calculations for us. We illustrate with the pine cone data from Example 3.8. Figure 3.15 shows the likelihood function.
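The likelihood derived above can also be maximized directly. The following is a sketch on simulated data (not the FACE pine cone data); the sample size, covariate range, and "true" coefficients are assumptions made only for illustration. Taking logs of the last line of the derivation gives $\sum_i \left[ y_i(\beta_0+\beta_1 x_i) - \log(1+e^{\beta_0+\beta_1 x_i}) \right]$, which is what `loglik` computes.

```r
set.seed(1)
n <- 200
x <- runif(n, 5, 30)                         # simulated covariate (illustrative)
y <- rbinom(n, 1, plogis(-7.5 + 0.36 * x))   # simulated binary responses

# Log likelihood: sum over i of y_i * eta_i - log(1 + exp(eta_i)),
# where eta_i = beta0 + beta1 * x_i is the linear predictor.
loglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  sum(y * eta - log1p(exp(eta)))
}

# Maximize numerically (optim minimizes, so negate) and compare with glm.
opt <- optim(c(0, 0), function(b) -loglik(b), method = "BFGS")
fit <- glm(y ~ x, family = binomial)
rbind(optim = opt$par, glm = coef(fit))
```

The two rows should agree closely, which suggests that `glm` with `family = binomial` is maximizing this same likelihood.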


Figure 3.15: Likelihood function for the pine cone data


Figure 3.15 was produced by the following snippet.

    mature <- cones$X2000[ring1] > 0
    b0 <- seq(-11, -4, length = 60)
    b1 <- seq(.15, .5, length = 60)
    lik <- matrix(NA, 60, 60)
    for (i in 1:60)
      for (j in 1:60) {
        linpred <- b0[i] + b1[j] * cones$dbh[ring1]
        theta <- exp(linpred) / (1 + exp(linpred))
        lik[i, j] <- prod(theta^mature * (1 - theta)^(1 - mature))
      }
    lik <- lik / max(lik)
    contour(b0, b1, lik, xlab = expression(beta[0]),
            ylab = expression(beta[1]))

• mature is an indicator variable for whether a tree has at least one pine cone.

• The lines b0 <- ... and b1 <- ... set some values of $(\beta_0, \beta_1)$ at which to evaluate the likelihood. They were chosen after looking at the output from fitting the logistic regression model.

• lik <- ... creates a matrix to hold values of the likelihood function.

• linpred is the linear predictor. Because cones$dbh[ring1] is a vector, linpred is also a vector. Therefore theta is also a vector, as is theta^mature * (1-theta)^(1-mature). It will help your understanding of R to understand what these vectors are.

One notable feature of Figure 3.15 is the diagonal slope of the contour ellipses. The meaning is that we do not have independent information about $\beta_0$ and $\beta_1$. For example if we thought, for some reason, that $\beta_0 \approx -9$, then we could be fairly confident that $\beta_1$ is in the neighborhood of about .4 to about .45. But if we thought $\beta_0 \approx -6$, then we would believe that $\beta_1$ is in the neighborhood of about .25 to about .3. More generally, if we knew $\beta_0$, then we could estimate $\beta_1$ to within a range of about .05. But since we don't know $\beta_0$, we can only say that $\beta_1$ is likely to be somewhere between about .2 and .6. The dependent information for $(\beta_0, \beta_1)$ means that our marginal information for $\beta_1$ is much less precise than our


conditional information for $\beta_1$ given $\beta_0$. That imprecise marginal information is reflected in the output from R, shown in the following snippet which fits the model and summarizes the result.

    cones <- read.table("data/pinecones.dat", header = T)
    ring1 <- cones$ring == 1
    mature <- cones$X2000[ring1] > 0
    fit <- glm(mature ~ cones$dbh[ring1], family = binomial)
    summary(fit)
    ...
    Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
    (Intercept)      -7.46684    1.76004  -4.242 2.21e-05 ***
    cones$dbh[ring1]  0.36151    0.09331   3.874 0.000107 ***
    ...

• cones ... reads in the data. There is one line for each tree. The first few lines look like this.

      ring    ID xcoor ycoor spec  dbh X1998 X1999 X2000
    1    1 11003  0.71  0.53 pita 19.4     0     0     0
    2    1 11004  1.26  2.36 pita 14.1     0     0     4
    3    1 11011  1.44  6.16 pita 19.4     0     6     0

  ID is a unique identifying number for each tree; xcoor and ycoor are coordinates in the plane; spec is the species; pita stands for pinus taeda or loblolly pine; X1998, X1999 and X2000 are the numbers of pine cones each year.

• ring1 ... is an indicator variable for trees in Ring 1.

• mature ... indicates whether the tree had any cones at all in 2000. It is not a precise indicator of maturity.

• fit ... fits the logistic regression. glm fits a generalized linear model. The argument family=binomial tells R what kind of data we have. In this case it's binomial because y is either a success or failure.


• summary(fit) shows that $(\hat\beta_0, \hat\beta_1) \approx (-7.5, 0.36)$. The SD's are about 1.8 and .1. These values guided the choice of b0 and b1 in creating Figure 3.15. It's the SD of about .1 that says we can estimate $\beta_1$ to within an interval of about .4, or about $\pm 2$ SD's.

3.3.2 Poisson Regression

Section 3.3.1 dealt with the case where the response variable $Y$ was Bernoulli. Another common situation is where the response $Y$ is a count. In that case it is natural to adopt, at least provisionally, a model in which $Y$ has a Poisson distribution: $Y \sim \mathrm{Poi}(\lambda)$. When there are covariates $X$, then $\lambda$ may depend on $X$. It is common to adopt the regression

$$\log \lambda = \beta_0 + \beta_1 x \qquad (3.21)$$

Model 3.21 is another example of a generalized linear model. Example 3.10 illustrates its use.

Example 3.10 (Seedlings, continued)
Several earlier examples have discussed data from the Coweeta LTER on the emergence and survival of red maple (acer rubrum) seedlings. Example 3.2 showed that the arrival rate of seedlings seemed to vary by quadrat. Refer especially to Figure 3.4. Example 3.10 follows up that observation more quantitatively.

Roughly speaking, New seedlings arise in a two-step process. First, a seed falls out of the sky, then it germinates and emerges from the ground. We may reasonably assume that the emergence of one seedling does not affect the emergence of another (they're too small to interfere with each other) and hence that the number of New seedlings has a $\mathrm{Poi}(\lambda)$ distribution. Let $Y_{ij}$ be the number of New seedlings observed in quadrat $i$ and year $j$. Here are two fits in R, one in which $\lambda$ varies by quadrat and one in which it doesn't.

    new <- data.frame(count = count,
                      quadrat = as.factor(quadrat),
                      year = as.factor(year))
    fit0 <- glm(count ~ 1, family = poisson, data = new)
    fit1 <- glm(count ~ quadrat, family = poisson, data = new)

• The command data.frame creates a dataframe. R describes dataframes as


      "tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R's modeling software."

  We created a dataframe called new, having three columns called count, quadrat, and year. Each row of new contains a count (of New seedlings), a quadrat number and a year. There are as many rows as there are observations.

• The command as.factor turns its argument into a factor. That is, instead of treating quadrat and year as numerical variables, we treat them as indicator variables. That's because we don't want a quadrat variable running from 1 to 60 implying that the 60th quadrat has 60 times as much of something as the 1st quadrat. We want the quadrat numbers to act as labels, not as numbers.

• glm stands for generalized linear model. The family=poisson argument says what kind of data we're modelling. data=new says the data are to be found in a dataframe called new.

• The formula count ~ 1 says to fit a model with only an intercept, no covariates.

• The formula count ~ quadrat says to fit a model in which quadrat is a covariate. Of course that's really 59 new covariates, indicator variables for 59 of the 60 quadrats.

To examine the two fits and see which we prefer, we plotted actual versus fitted values and residuals versus fitted values in Figure 3.16. Panels (a) and (b) are from fit0. Because there may be overplotting, we jittered the points and replotted them in panels (c) and (d). Panels (e) and (f) are jittered values from fit1. Comparison of panels (c) to (e) and (d) to (f) shows that fit1 predicts more accurately and has smaller residuals than fit0. That's consistent with our reading of Figure 3.4. So we prefer fit1.

Figure 3.17 continues the story. Panel (a) shows residuals from fit1 plotted against year. There is a clear difference between years. Years 1, 3, and 5 are high while years 2 and 4 are low. So perhaps we should use year as a predictor. That's done by

    fit2 <- glm(count ~ quadrat + year, family = poisson,
                data = new)

Panels (b) and (c) show diagnostic plots for fit2. Compare to similar panels in Figure 3.16 to see whether using year makes an appreciable difference to the fit.
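The comparison of an intercept-only fit with a fit that lets $\lambda$ vary by quadrat can be tried on simulated counts. This is a sketch only: the six quadrats, five years, and arrival rates below are invented for illustration and are not the Coweeta data. The object names end in `_sim` so they are not confused with the fits in the example.

```r
set.seed(2)
quadrat <- factor(rep(1:6, each = 5))      # 6 hypothetical quadrats, 5 years each
rate    <- c(2, 4, 6, 8, 12, 16)           # assumed arrival rate for each quadrat
sim     <- data.frame(count = rpois(30, rate[quadrat]), quadrat = quadrat)

fit0_sim <- glm(count ~ 1,       family = poisson, data = sim)  # one common lambda
fit1_sim <- glm(count ~ quadrat, family = poisson, data = sim)  # lambda varies by quadrat

# Model 3.21 is linear on the log scale, so exponentiating the linear
# predictor gives the fitted means.
all.equal(unname(exp(predict(fit1_sim))), unname(fitted(fit1_sim)))

# With a factor covariate, the fitted mean in each quadrat is that
# quadrat's average count.
cbind(fitted = tapply(fitted(fit1_sim), sim$quadrat, mean),
      average = tapply(sim$count, sim$quadrat, mean))

# A smaller residual deviance indicates a closer fit.
c(fit0 = deviance(fit0_sim), fit1 = deviance(fit1_sim))
```

Residual deviance is used here as a one-number summary; the plots of actual versus fitted values described in the text show the same comparison in more detail.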


Figure 3.16: Actual vs. fitted and residuals vs. fitted for the New seedling data. (a) and (b): fit0. (c) and (d): jittered values from fit0. (e) and (f): jittered values from fit1.


Figure 3.16 was created with the following snippet.

    par(mfrow = c(3, 2))   # six panels; layout reconstructed
    plot(fitted(fit0), new$count, xlab = "fitted values",
         ylab = "actual values", main = "(a)")
    abline(0, 1)
    plot(fitted(fit0), residuals(fit0), xlab = "fitted values",
         ylab = "residuals", main = "(b)")
    plot(jitter(fitted(fit0)), jitter(new$count),
         xlab = "fitted values", ylab = "actual values",
         main = "(c)")
    abline(0, 1)
    plot(jitter(fitted(fit0)), jitter(residuals(fit0)),
         xlab = "fitted values", ylab = "residuals", main = "(d)")
    plot(jitter(fitted(fit1)), jitter(new$count),
         xlab = "fitted values", ylab = "actual values",
         main = "(e)")
    abline(0, 1)
    plot(jitter(fitted(fit1)), jitter(residuals(fit1)),
         xlab = "fitted values", ylab = "residuals", main = "(f)")

The following snippet shows how Figure 3.17 was made in R.

    par(mfrow = c(2, 2))   # three panels; layout reconstructed
    plot(new$year, residuals(fit1),
         xlab = "year", ylab = "residuals", main = "(a)")
    plot(jitter(fitted(fit2)), jitter(new$count),
         xlab = "fitted values", ylab = "actual values",
         main = "(b)")
    abline(0, 1)
    plot(jitter(fitted(fit2)), jitter(residuals(fit2)),
         xlab = "fitted values", ylab = "residuals", main = "(c)")


Figure 3.17: New seedling data. (a): residuals from fit1 vs. year. (b): actual vs. fitted from fit2. (c): residuals vs. fitted from fit2.


3.4 Predictions from Regression

From a regression equation, if we have estimates of the $\beta$'s we can

1. plug in the values of the $x$'s we have to get fitted values, and
2. plug in the values of $x$ for new or future cases to get predicted values.

We illustrate with the mtcars data.

Example 3.11 (mtcars, continued)
Example 3.7 concluded with a comparison of three models for mpg. Here we continue that comparison by seeing whether the models make substantially different predictions for any of the cars in the data set. For each car we know its weight and horsepower and we have estimates of all the parameters in Equations 3.17, 3.18, and 3.19, so we can compute its fitted values from all three models. In symbols,

$$
\begin{aligned}
\hat y_i &= \hat\beta_0 + \hat\beta_1 \mathrm{wt}_i && \text{(from 3.17)} \\
\hat y_i &= \hat\gamma_0 + \hat\gamma_1 \mathrm{hp}_i && \text{(from 3.18)} \\
\hat y_i &= \hat\delta_0 + \hat\delta_1 \mathrm{wt}_i + \hat\delta_2 \mathrm{hp}_i && \text{(from 3.19)}
\end{aligned}
$$

We plot the fitted values against each other to see whether there are any noticeable differences. Figure 3.18 displays the result. It shows that mpg.fit1 and mpg.fit3 produce fitted values substantially similar to each other and agreeing fairly well with actual values, while mpg.fit2 produces fitted values that differ somewhat from the others and from the actual values, at least for a few cars. This is another reason to prefer mpg.fit1 and mpg.fit3 to mpg.fit2. In Example 3.7 this lack of fit showed up as a higher $\hat\sigma$ for mpg.fit2 than for mpg.fit1.

Figure 3.18 was made with the following snippet.

    fitted.mpg <- cbind(fitted(mpg.fit1), fitted(mpg.fit2),
                        fitted(mpg.fit3), mtcars$mpg)
    pairs(fitted.mpg, labels = c("fitted from wt",
          "fitted from hp", "fitted from both", "actual mpg"))

• fitted(xyz) extracts fitted values. xyz can be any model previously fitted by lm, glm, or other R functions to fit models.
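The distinction between fitted values and predicted values can be sketched with the built-in mtcars data. The object names `fit_wt` and `new_cars`, and the two hypothetical weights, are illustrative assumptions; `fit_wt` is a regression of mpg on weight fitted here for the sketch and is not necessarily identical to the text's mpg.fit1.

```r
data(mtcars)
fit_wt <- lm(mpg ~ wt, data = mtcars)

# Fitted values: plug in the weight of each car already in the data set.
head(fitted(fit_wt))

# Predicted values: plug in weights for hypothetical new cars
# (wt is recorded in units of 1000 lbs).
new_cars <- data.frame(wt = c(2.5, 3.5))
pred <- predict(fit_wt, newdata = new_cars)
pred

# Both are the same plug-in calculation: intercept + slope * wt.
b <- coef(fit_wt)
by_hand <- unname(b[1] + b[2] * new_cars$wt)
by_hand
```

`predict` with no `newdata` argument returns the same numbers as `fitted`, so the only difference between the two kinds of value is which $x$'s are plugged in.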


Figure 3.18: Actual mpg and fitted values from three models


In Example 3.5 we posited model 3.10:

$$y = \beta_0 + \beta_1 x + \epsilon \qquad (3.22)$$

where $x$ was mean temperature during the week and $y$ was ice cream consumption during the week. Now we want to fit the model to the data and use the fit to predict consumption. In addition, we want to say how accurate the predictions are. Let $x_f$ be the predicted mean temperature for some future week and $y_f$ be consumption. $x_f$ is known; $y_f$ is not. Our model says

$$y_f \sim N(\mu_f, \sigma) \quad \text{where} \quad \mu_f = \beta_0 + \beta_1 x_f.$$

$\mu_f$ is unknown because $\beta_0$ and $\beta_1$ are unknown. But $(\beta_0, \beta_1)$ can be estimated from the data, so we can form an estimate

$$\hat\mu_f = \hat\beta_0 + \hat\beta_1 x_f.$$

How accurate is $\hat\mu_f$ as an estimate of $\mu_f$? The answer depends on how accurate $(\hat\beta_0, \hat\beta_1)$ are as estimates of $(\beta_0, \beta_1)$. Advanced theory about Normal distributions, beyond the scope of this book, tells us

$$\hat\mu_f \sim N(\mu_f, \sigma_{\mathrm{fit}})$$

for some $\sigma_{\mathrm{fit}}$ which may depend on $x_f$; we have omitted the dependency from the notation.

$\mu_f$ is the average ice cream consumption in all weeks whose mean temperature is $x_f$. So $\hat\mu_f$ is also an estimator of $y_f$. But in any particular week the actual consumption won't exactly equal $\mu_f$. Our model says

$$y_f = \mu_f + \epsilon$$

where $\epsilon \sim N(0, \sigma)$. So in any given week $y_f$ will differ from $\mu_f$ by an amount up to about $\pm 2\sigma$ or so.

Thus the uncertainty in estimating $y_f$ has two components: the uncertainty of $\mu_f$, which comes because we don't know $(\beta_0, \beta_1)$, and the variability due to $\epsilon$. We can't say in advance which component will dominate. Sometimes it will be the first, sometimes the second. What we can say is that as we collect more and more data, we learn $(\beta_0, \beta_1)$ more accurately, so the first component becomes negligible and the second component dominates. When that happens, we won't go far wrong by simply ignoring the first component.
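The two components of uncertainty can be seen numerically with `predict`, whose `interval` argument for `lm` fits accepts `"confidence"` (uncertainty about $\mu_f$ only) and `"prediction"` (which also includes the variability of $\epsilon$). The sketch below uses simulated data; the function name `interval_widths`, the coefficients, the temperature range, and $\sigma$ are all assumptions made for illustration and are not the ice cream data.

```r
# Widths of the interval for mu_f and of the interval for y_f at x_f = x_new,
# from a simulated data set of size n.
interval_widths <- function(n, x_new = 50, sigma = 0.04) {
  temp <- runif(n, 30, 70)                         # simulated weekly temperatures
  cons <- 0.2 + 0.003 * temp + rnorm(n, 0, sigma)  # made-up beta0, beta1
  fit  <- lm(cons ~ temp)
  new  <- data.frame(temp = x_new)
  ci <- predict(fit, new, interval = "confidence")  # for mu_f
  pr <- predict(fit, new, interval = "prediction")  # for y_f
  c(n = n,
    mean_width = as.numeric(ci[, "upr"] - ci[, "lwr"]),
    pred_width = as.numeric(pr[, "upr"] - pr[, "lwr"]))
}

set.seed(3)
small <- interval_widths(20)
large <- interval_widths(2000)
round(rbind(small, large), 4)
```

With more data the interval for $\mu_f$ shrinks toward zero width, while the interval for $y_f$ settles near a width of about $4\sigma$, in line with the remark that the second component eventually dominates.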


3.5 Exercises

1. (a) Use the attenu, airquality and faithful datasets to reproduce Figures 3.1 (a), (b) and (d).
   (b) Add lowess and supsmu fits.
   (c) Figure out how to use the tuning parameters and try out several different values. (Use the help or help.start functions.)

2. With the mtcars dataset, use a scatterplot smoother to plot the relationship between weight and displacement. Does it matter which we think of as $X$ and which as $Y$? Is one way more natural than the other?

3. Download the 1970 draft data from DASL and reproduce Figure 3.3. Use the tuning parameters (f for lowess; span for supsmu) to draw smoother and wigglier scatterplot smoothers.

4. How could you test whether the draft numbers in Example 3.1 were generated uniformly? What would $H_0$ be? What would be a good test statistic $w$? How would you estimate the distribution of $w$ under $H_0$?

5. Using the information in Example 3.6 estimate the mean calorie content of meat and poultry hot dogs.

6. Refer to Examples 2.2, 3.4, and 3.6.
   (a) Formulate statistical hypotheses for testing whether the mean calorie content of Poultry hot dogs is equal to the mean calorie content of Beef hot dogs.
   (b) What statistic will you use?
   (c) What should that statistic be if $H_0$ is true?
   (d) How many SD's is it off?
   (e) What do you conclude?
   (f) What about Meat hot dogs?

7. Refer to Examples 2.2, 3.4, and 3.6. Figure 3.5 shows plenty of overlap in the calorie contents of Beef and Poultry hot dogs. I.e., there are many Poultry hot dogs with more calories than many Beef hot dogs. But Figure 3.9 shows very little support for values of $\delta_P$ near 0. Can that be right? Explain.


8. Examples 2.2, 3.4, and 3.6 analyze the calorie content of Beef, Meat, and Poultry hot dogs. Create a similar analysis, but for sodium content. Your analysis should cover at least the following steps.
   (a) A stripchart similar to Figure 3.5 and density estimates similar to Figure 3.6.
   (b) A model similar to Model 3.4, including definitions of the parameters.
   (c) Indicator variables analogous to those in Equation 3.6.
   (d) A model similar to Model 3.7, including definitions of all the terms.
   (e) A fit in R, similar to that in Example 3.6.
   (f) Parameter estimates and SD's.
   (g) Plots of likelihood functions, analogous to Figure 3.9.
   (h) Interpretation.

9. Analyze the PlantGrowth data from page 214. State your conclusion about whether the treatments are effective. Support your conclusion with analysis.

10. Analyze the Ice Cream data from Example 3.5. Write a model similar to Model 3.7, including definitions of all the terms. Use R to fit the model. Estimate the coefficients and say how accurate your estimates are. If temperature increases by about 5°F, about how much would you expect ice cream consumption to increase? Make a plot similar to Figure 3.8, but add on the line implied by Equation 3.10 and your estimates of $\beta_0$ and $\beta_1$.

11. Verify the claim that for Equation 3.18 $\hat\gamma_0 \approx 30$, $\hat\gamma_1 \approx -.07$ and $\hat\sigma \approx 3.9$.

12. Does a football filled with helium travel further than one filled with air? DASL has a data set that attempts to answer the question. Go to DASL, http://lib.stat.cmu.edu/DASL, download the data set Helium football and read the story. Use what you know about linear models to analyze the data and reach a conclusion. You must decide whether to include data from the first several kicks and from kicks that appear to be flubbed. Does your decision affect your conclusion?

13. Use the PlantGrowth data from R. Refer to page 214 and Equation 3.5.
   (a) Estimate $\mu_C$, $\mu_{T1}$, $\mu_{T2}$ and $\sigma$.
   (b) Test the hypothesis $\mu_{T1} = \mu_C$.


   (c) Test the hypothesis $\mu_{T1} = \mu_{T2}$.

14. Jack and Jill, two Duke sophomores, have to choose their majors. They both love poetry so they might choose to be English majors. Then their futures would be full of black clothes, black coffee, low paying jobs, and occasional volumes of poetry published by independent, non-commercial presses. On the other hand, they both see the value of money, so they could choose to be Economics majors. Then their futures would be full of power suits, double cappuccinos, investment banking and, at least for Jack, membership in the Augusta National golf club. But which would make them more happy?

   To investigate, they conduct a survey. Not wanting to embarrass their friends and themselves, Jack and Jill go up to Chapel Hill to interview poets and investment bankers. In all of Chapel Hill there are 90 poets but only 10 investment bankers. J&J interview them all. From the interviews J&J compute the Happiness Quotient or HQ of each subject. The HQ's are in Figure 3.19. J&J also record two indicator variables for each person: $P_i = 1$ or $0$ (for poets and bankers); $B_i = 1$ or $0$ (for bankers and poets).

   Jill and Jack each write a statistical model:

   Jill: $\mathrm{HQ}_i = \alpha_0 + \alpha_1 P_i + \epsilon_i$
   Jack: $\mathrm{HQ}_i = \beta_1 P_i + \beta_2 B_i + \epsilon_i$

   (a) Say in words what are $\alpha_0$, $\alpha_1$, $\beta_1$ and $\beta_2$.
   (b) Express $\beta_1$ and $\beta_2$ in terms of $\alpha_0$ and $\alpha_1$.
   (c) In their data set J&J find $\overline{\mathrm{HQ}} = 43$ among poets, $\overline{\mathrm{HQ}} = 44$ among bankers and $\hat\sigma^2 = 1$. (Subjects report disappointment with their favorite basketball team as the primary reason for low HQ.) Find sensible numerical estimates of $\alpha_0$, $\alpha_1$, $\beta_1$ and $\beta_2$.

15. Is poverty related to academic performance in school? The file schools_poverty at this text's website contains relevant data from the Durham, NC school system in 2001. The first few lines are

      pfl eog type
    1  66  65    e
    2  32  73    m
    3  65  65    e


Figure 3.19: Happiness Quotient of bankers and poets


   Each school in the Durham public school system is represented by one line in the file. The variable pfl stands for percent free lunch. It records the percentage of the school's student population that qualifies for a free lunch program. It is an indicator of poverty. The variable eog stands for end of grade. It is the school's average score on end of grade tests and is an indicator of academic success. Finally, type indicates the type of school: e, m, or h for elementary, middle or high school, respectively. You are to investigate whether pfl is predictive of eog.
   (a) Read the data into R and plot it in a sensible way. Use different plot symbols for the three types of schools.
   (b) Does there appear to be a relationship between pfl and eog? Is the relationship the same for the three types of schools? Decide whether the rest of your analysis should include all types of schools, or only one or two.
   (c) Using the types of schools you think best, remake the plot and add a regression line. Say in words what the regression line means.
   (d) During the 2000-2001 school year Duke University, in Durham, NC, sponsored a tutoring program in one of the elementary schools. Many Duke students served as tutors. From looking at the plot, and assuming the program was successful, can you figure out which school it was?

16. Load mtcars into an R session. Use R to find the m.l.e.'s $(\hat\beta_0, \hat\beta_1)$. Confirm that they agree with the line drawn in Figure 3.11(a). Starting from Equation 3.17, derive the m.l.e.'s for $\beta_0$ and $\beta_1$.

17. Get more current data similar to mtcars. Carry out a regression analysis similar to Example 3.7. Have relationships among the variables changed over time? What are now the most important predictors of mpg?

18. Repeat the logistic regression of am on wt, but use hp instead of wt.

19. A researcher randomly selects cities in the US. For each city she records the number of bars $y_i$ and the number of churches $z_i$. In the regression equation $z_i = \beta_0 + \beta_1 y_i$ do you expect $\beta_1$ to be positive, negative, or around 0?

20. Jevons' coins?

21. (a) Jane writes the following R code:


        x <- runif(60, -1, 1)

   Describe x. Is it a number, a vector, or a matrix? What is in it?
   (b) Now she writes

        y <- x + rnorm(60)
        myfit <- lm(y ~ x)

   Make an intelligent guess of what she found for $\hat\beta_0$ and $\hat\beta_1$.
   (c) Using advanced statistical theory she calculates

   $$\mathrm{SD}(\hat\beta_0) = .13 \qquad \mathrm{SD}(\hat\beta_1) = .22$$

   Finally she writes

        in0 <- 0
        in1 <- 0
        for (i in 1:100) {
          x <- runif(60, -1, 1)
          y <- x + rnorm(60)
          fit <- lm(y ~ x)
          if (abs(fit$coef[1]) <= .26) in0 <- in0 + 1
          if (abs(fit$coef[2] - 1) <= .44) in1 <- in1 + 1
        }

   Make an intelligent guess of in0 and in1 after Jane ran this code.

22. The Army is testing a new mortar. They fire a shell up at an angle of 60° and track its progress with a laser. Let $t_1, t_2, \ldots, t_{100}$ be equally spaced times from $t_1$ = (time of firing) to $t_{100}$ = (time when it lands). Let $y_1, \ldots, y_{100}$ be the shell's heights and $z_1, \ldots, z_{100}$ be the shell's distance from the howitzer (measured horizontally along the ground) at times $t_1, t_2, \ldots, t_{100}$. The $y_i$'s and $z_i$'s are measured by the laser. The measurements are not perfect; there is some measurement error. In answering the following questions you may assume that the shell's horizontal speed remains constant until it falls to ground.


   (a) True or False: The equation
       $$y_i = \beta_0 + \beta_1 t_i + \epsilon_i$$
       should fit the data well.
   (b) True or False: The equation
       $$y_i = \beta_0 + \beta_1 t_i + \beta_2 t_i^2 + \epsilon_i \qquad (3.23)$$
       should fit the data well.
   (c) True or False: The equation
       $$z_i = \beta_0 + \beta_1 t_i + \epsilon_i \qquad (3.24)$$
       should fit the data well.
   (d) True or False: The equation
       $$z_i = \beta_0 + \beta_1 t_i + \beta_2 t_i^2 + \epsilon_i \qquad (3.25)$$
       should fit the data well.
   (e) True or False: The equation
       $$y_i = \beta_0 + \beta_1 z_i + \epsilon_i \qquad (3.26)$$
       should fit the data well.
   (f) True or False: The equation
       $$y_i = \beta_0 + \beta_1 z_i + \beta_2 z_i^2 + \epsilon_i \qquad (3.27)$$
       should fit the data well.
   (g) Approximately what value did the Army find for $\hat\beta_0$ in Part (b)?
   (h) Approximately what value did the Army find for $\hat\beta_2$ in Part (d)?

23. Some nonstatisticians (not readers of this book, we hope) do statistical analyses based almost solely on numerical calculations and don't use plots. R comes with the data set anscombe which demonstrates the value of plots. Type data(anscombe) to load the data into your R session. It is an 11 by 8 dataframe. The variable names are x1, x2, x3, x4, y1, y2, y3, and y4.


   (a) Start with x1 and y1. Use lm to model y1 as a function of x1. Print a summary of the regression so you can see $\hat\beta_0$, $\hat\beta_1$, and $\hat\sigma$.
   (b) Do the same for the other pairs: x2 and y2, x3 and y3, x4 and y4.
   (c) What do you conclude so far?
   (d) Plot y1 versus x1. Repeat for each pair. You may want to put all four plots on the same page. (It's not necessary, but you should know how to draw the regression line on each plot. Do you?)
   (e) What do you conclude?
   (f) Are any of these pairs well described by linear regression? How would you describe the others? If the others were not artificially constructed data, but were real, how would you analyze them?

24. Here's some R code:

        x <- rnorm(1, 2, 3)
        y <- -2*x + 1 + rnorm(1, 0, 1)

   (a) What is the marginal distribution of x?
   (b) Write down the marginal density of x.
   (c) What is the conditional distribution of y given x?
   (d) Write down the conditional density of y given x.
   (e) Write down the joint density of (x, y).

   Here's more R code:

        N.sim <- 1000
        w <- rep(NA, N.sim)
        for (i in 1:N.sim) {
          x <- rnorm(50, 2, 3)
          y <- -2*x + 1 + rnorm(50, 0, 1)
          fit <- lm(y ~ x)
          w[i] <- fit$coef[2]
        }
        z1 <- mean(w)
        z2 <- sqrt(var(w + 2))


   What does z1 estimate? What does z2 estimate?

25. A statistician thinks the regression equation $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ fits her data well. She would like to learn $\beta_1$. She is able to measure the $y_i$'s accurately but can measure the $x_i$'s only approximately. In fact, she can measure $w_i = x_i + \delta_i$ where $\delta_i \sim N(0, .1)$. So she can fit the regression equation $y_i = \gamma_0 + \gamma_1 w_i + \epsilon_i$. Note that $(\gamma_0, \gamma_1)$ might be different than $(\beta_0, \beta_1)$ because they're for the $w_i$'s, not the $x_i$'s. So the statistician writes the following R code.

        N.sim <- 1000
        b.0 <- -10:10
        b.1 <- -10:10
        n <- 50
        for (i in 1:21)
          for (j in 1:21) {
            val <- rep(NA, N.sim)
            for (k in 1:N.sim) {
              x <- rnorm(n)
              w <- x + rnorm(n, 0, sqrt(.1))
              y <- b.0[i] + x * b.1[j] + rnorm(n, 3)
              fit <- lm(y ~ w)
              val[k] <- fit$coef[2]
            }
            m <- mean(val)
            sd <- sqrt(var(val))
            print(c(m, sd, (m - b.1[j]) / sd))
          }

   What is she trying to do? The last time through the loop, the print statement yields [1] 9.086723 0.434638 -2.101237. What does this show?

26. The purpose of this exercise is to familiarize yourself with plotting logistic regression curves and getting a feel for the meaning of $\beta_0$ and $\beta_1$.
   (a) Choose some values of $x$. You will want between about 20 and 100 evenly spaced values. These will become the abscissa of your plot.
   (b) Choose some values of $\beta_0$ and $\beta_1$. You are trying to see how different values of the $\beta$'s affect the curve. So you might begin with a single value of $\beta_1$ and several values of $\beta_0$, or vice versa.


   (c) For each choice of $(\beta_0, \beta_1)$ calculate the set of $\theta_i = e^{\beta_0 + \beta_1 x_i} / (1 + e^{\beta_0 + \beta_1 x_i})$ and plot $\theta_i$ versus $x_i$. You should get sigmoidal shaped curves. These are logistic regression curves.
   (d) You may find that the particular $x$'s and $\beta$'s you chose do not yield a visually pleasing result. Perhaps all your $\theta$'s are too close to 0 or too close to 1. In that case, go back and choose different values. You will have to play around until you find $x$'s and $\beta$'s compatible with each other.

27. Carry out a logistic regression analysis of the O-ring data. What does your analysis say about the probability of O-ring damage at 36°F, the temperature of the Challenger launch? How relevant should such an analysis have been to the decision of whether to postpone the launch?

28. This exercise refers to Example 3.10.
   (a) Why are the points lined up vertically in Figure 3.16, panels (a) and (b)?
   (b) Why do panels (c) and (d) appear to have more points than panels (a) and (b)?
   (c) If there were no jittering, how many distinct values would there be on the abscissa of panels (c) and (d)?
   (d) Download the seedling data. Fit a model in which year is a predictor but quadrat is not. Compare to fit1. Which do you prefer? Which variable is more important: quadrat or year? Or are they both important?


Chapter 4 More Probability

4.1 More Probability Density

Section 1.2 on page 6 introduced probability densities. Section 4.1 discusses them further and gives a formal definition.

Let $X$ be a continuous random variable with cdf $F_X$. Theorem 1.2 on page 8 implies that $f_X(x) = \frac{d}{db} F_X(b) \big|_{b=x}$ and therefore that we can define the pdf by $f_X(x) \equiv F_X'(x) = \frac{d}{db} F_X(b) \big|_{b=x}$. In fact, this definition is a little too restrictive. The key property of pdf's is that the probability of a set $A$ is given by the integral of the pdf. I.e.,

$$P[X \in A] = \int_A f_X(x)\, dx.$$

But if $f^*$ is a function that differs from $f_X$ at only countably many points then, for any set $A$, $\int_A f^* = \int_A f_X$, so we could just as well have defined

$$P[X \in A] = \int_A f^*(x)\, dx.$$

There are infinitely many functions having the same integrals as $f_X$ and $f^*$. These functions differ from each other on "sets of measure zero", terminology beyond our scope but defined in books on measure theory. For our purposes we can think of sets of measure zero as sets containing at most countably many points. In effect, the pdf of $X$ can be arbitrarily changed on sets of measure zero. It does not matter which of the many equivalent functions we use as the probability density of $X$. Thus, we define


Definition 4.1. Any function $f$ such that, for all intervals $A$,

$$P[X \in A] = \int_A f(x)\, dx$$

is called a probability density function, or pdf, for the random variable $X$. Any such function may be denoted $f_X$.

Definition 4.1 can be used in an alternate proof of Theorem 1.1 on page 12. The central step in the proof is just a change-of-variable in an integral, showing that Theorem 1.1 is, in essence, just a change of variables. For convenience we restate the theorem before reproving it.

Theorem 1.1 Let $X$ be a random variable with pdf $p_X$. Let $g$ be a differentiable, monotonic, invertible function and define $Z = g(X)$. Then the pdf of $Z$ is

$$p_Z(t) = p_X(g^{-1}(t)) \left| \frac{d\, g^{-1}(t)}{dt} \right|$$

Proof. For any set $A$, $P[Z \in g(A)] = P[X \in A] = \int_A p_X(x)\, dx$. Let $z = g(x)$ and change variables in the integral to get

$$P[Z \in g(A)] = \int_{g(A)} p_X(g^{-1}(z)) \left| \frac{dx}{dz} \right| dz$$

I.e., $P[Z \in g(A)] = \int_{g(A)} \text{something}\; dz$. Therefore something must be $p_Z(z)$. Hence, $p_Z(z) = p_X(g^{-1}(z))\, |dx/dz|$.

4.2 Random Vectors

It is often useful, even essential, to talk about several random variables simultaneously. We have seen many examples throughout the text beginning with Section 1.5 on joint, marginal, and conditional probabilities. Section 4.2 reviews the basics and sets out new probability theory for multiple random variables.

Let $X_1, \ldots, X_n$ be a set of $n$ random variables. The $n$-dimensional vector $\vec X = (X_1, \ldots, X_n)$ is called a multivariate random variable or random vector. As explained below, $\vec X$ has a pdf or pmf, a cdf, an expected value, and a covariance matrix, all analogous to univariate random variables.


4.2.1 Densities of Random Vectors

When $X_1, \ldots, X_n$ are continuous then $\vec X$ has a pdf, written

$$p_{\vec X}(x_1, \ldots, x_n).$$

As in the univariate case, the pdf is any function whose integral yields probabilities. That is, if $A$ is a region in $\mathbb{R}^n$ then

$$P[\vec X \in A] = \int \cdots \int_A p_{\vec X}(x_1, \ldots, x_n)\, dx_1 \ldots dx_n$$

For example, let $X_1 \sim \mathrm{Exp}(1)$; $X_2 \sim \mathrm{Exp}(2)$; $X_1 \perp X_2$; and $\vec X = (X_1, X_2)$ and suppose we want to find $P[|X_1 - X_2| \le 1]$. Our plan for solving this problem is to find the joint density $p_{\vec X}$, then integrate $p_{\vec X}$ over the region $A$ where $|X_1 - X_2| \le 1$. Because $X_1 \perp X_2$, the joint density is

$$p_{\vec X}(x_1, x_2) = p_{X_1}(x_1)\, p_{X_2}(x_2) = e^{-x_1} \times \tfrac{1}{2} e^{-x_2/2}$$

To find the region $A$ over which to integrate, it helps to plot the $X_1$-$X_2$ plane. Making the plot is left as an exercise.

$$
\begin{aligned}
P[|X_1 - X_2| \le 1] &= \iint_A p_{\vec X}(x_1, x_2)\, dx_1\, dx_2 \\
 &= \frac{1}{2} \int_0^1 \int_0^{x_1 + 1} e^{-x_1} e^{-x_2/2}\, dx_2\, dx_1
  + \frac{1}{2} \int_1^\infty \int_{x_1 - 1}^{x_1 + 1} e^{-x_1} e^{-x_2/2}\, dx_2\, dx_1 \\
 &\approx 0.47 \qquad (4.1)
\end{aligned}
$$

The random variables $(X_1, \ldots, X_n)$ are said to be mutually independent or jointly independent if

$$p_{\vec X}(x_1, \ldots, x_n) = p_{X_1}(x_1) \times \cdots \times p_{X_n}(x_n)$$

for all vectors $(x_1, \ldots, x_n)$.

Mutual independence implies pairwise independence. I.e., if $(X_1, \ldots, X_n)$ are mutually independent, then any pair $(X_i, X_j)$ are also independent. The proof is left as an exercise. It is curious but true that pairwise independence does not imply joint independence. For an example, consider the discrete three-dimensional


distribution on $\vec X = (X_1, X_2, X_3)$ with

$$
\begin{aligned}
P[(X_1, X_2, X_3) = (0, 0, 0)] &= P[(X_1, X_2, X_3) = (1, 0, 1)] \\
 &= P[(X_1, X_2, X_3) = (0, 1, 1)] \\
 &= P[(X_1, X_2, X_3) = (1, 1, 0)] = 1/4 \qquad (4.2)
\end{aligned}
$$

It is easily verified that $X_1 \perp X_2$, $X_1 \perp X_3$, and $X_2 \perp X_3$ but that $X_1$, $X_2$, and $X_3$ are not mutually independent. See Exercise 6.

4.2.2 Moments of Random Vectors

When $\vec X$ is a random vector, its expected value is also a vector.

$$E[\vec X] \equiv (E[X_1], \ldots, E[X_n])$$

When $\vec X \equiv (X_1, \ldots, X_n)$ is a random vector, instead of a variance it has a covariance matrix. The $ij$'th entry of the covariance matrix is $\mathrm{Cov}(X_i, X_j)$. The notation is

$$
\mathrm{Cov}(\vec X) \equiv \Sigma_{\vec X} =
\begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{1n} & \sigma_{2n} & \cdots & \sigma_n^2
\end{bmatrix}
$$

where $\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$ and $\sigma_i^2 = \mathrm{Var}(X_i)$. Sometimes $\sigma_i^2$ is also denoted $\sigma_{ii}$.

4.2.3 Functions of Random Vectors

Section 4.2.3 considers functions of random vectors. If $g$ is an arbitrary function that maps $\vec X$ to $\mathbb{R}$ then

$$E[g(\vec X)] = \int \cdots \int g(x_1, \ldots, x_n)\, p_{\vec X}(x_1, \ldots, x_n)\, dx_1 \cdots dx_n$$

but it's hard to say much in general about the variance of $g(\vec X)$. When $g$ is a linear function we can go farther, but first we need a lemma.

Lemma 4.1. Let $X_1$ and $X_2$ be random variables and $Y = X_1 + X_2$. Then

1. $E[Y] = E[X_1] + E[X_2]$


2. $\text{Var}(Y) = \text{Var}(X_1) + \text{Var}(X_2) + 2\,\text{Cov}(X_1, X_2)$

Proof. Left as exercise.

Now we can deal with linear combinations of random vectors.

Theorem 4.2. Let $\vec{a} = (a_1, \dots, a_n)$ be an $n$-dimensional vector and define $Y = \vec{a}^t \vec{X} = \sum a_i X_i$. Then,

1. $E[Y] = E[\sum a_i X_i] = \sum a_i E[X_i]$

2. $\text{Var}(Y) = \sum a_i^2\, \text{Var}(X_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} a_i a_j\, \text{Cov}(X_i, X_j) = \vec{a}^t\, \Sigma_{\vec{X}}\, \vec{a}$

Proof. Use Lemma 4.1 and Theorems 1.3 (pg. 39) and 1.4 (pg. 40). See Exercise 8.

The next step is to consider several linear combinations simultaneously. For some $k \le n$, and for each $i = 1, \dots, k$, let

$$Y_i = a_{i1} X_1 + \cdots + a_{in} X_n = \sum_j a_{ij} X_j = \vec{a}_i^t \vec{X}$$

where the $a_{ij}$'s are arbitrary constants and $\vec{a}_i = (a_{i1}, \dots, a_{in})$. Let $\vec{Y} = (Y_1, \dots, Y_k)$. In matrix notation,

$$\vec{Y} = A \vec{X}$$

where $A$ is the $k \times n$ matrix of elements $a_{ij}$. Covariances of the $Y_i$'s are given by

$$
\begin{aligned}
\text{Cov}(Y_i, Y_j) &= \text{Cov}(\vec{a}_i^t \vec{X},\, \vec{a}_j^t \vec{X}) \\
&= \sum_{k=1}^{n} \sum_{\ell=1}^{n} \text{Cov}(a_{ik} X_k,\, a_{j\ell} X_\ell) \\
&= \sum_{k=1}^{n} a_{ik} a_{jk} \sigma_k^2 + \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} (a_{ik} a_{j\ell} + a_{jk} a_{i\ell})\, \sigma_{k\ell} \\
&= \vec{a}_i^t\, \Sigma_{\vec{X}}\, \vec{a}_j
\end{aligned}
$$

Combining the previous result with Theorem 4.2 yields Theorem 4.3.

Theorem 4.3. Let $\vec{X}$ be a random vector of dimension $n$ with mean $E[\vec{X}] = \mu$ and covariance matrix $\text{Cov}(\vec{X}) = \Sigma$; let $A$ be a $k \times n$ matrix of rank $k$; and let $\vec{Y} = A \vec{X}$. Then


1. $E[\vec{Y}] = A\mu$, and

2. $\text{Cov}(\vec{Y}) = A \Sigma A'$

Finally, we take up the question of multivariate transformations, extending the univariate version, Theorem 1.1 (pg. 12). Let $\vec{X} = (X_1, \dots, X_n)$ be an $n$-dimensional continuous random vector with pdf $f_{\vec{X}}$. Define a new $n$-dimensional random vector $\vec{Y} = (Y_1, \dots, Y_n) = (g_1(\vec{X}), \dots, g_n(\vec{X}))$ where the $g_i$'s are differentiable functions and where the transformation $g : \vec{X} \mapsto \vec{Y}$ is invertible. What is $f_{\vec{Y}}$, the pdf of $\vec{Y}$?

Let $J$ be the so-called Jacobian matrix of partial derivatives.

$$
J =
\begin{pmatrix}
\frac{\partial Y_1}{\partial X_1} & \cdots & \frac{\partial Y_1}{\partial X_n} \\
\frac{\partial Y_2}{\partial X_1} & \cdots & \frac{\partial Y_2}{\partial X_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial Y_n}{\partial X_1} & \cdots & \frac{\partial Y_n}{\partial X_n}
\end{pmatrix}
$$

and $|J|$ be the absolute value of the determinant of $J$.

Theorem 4.4. $f_{\vec{Y}}(\vec{y}) = f_{\vec{X}}(g^{-1}(\vec{y}))\, |J|^{-1}$

Proof. The proof follows the alternate proof of Theorem 1.1 on page 264. For any set $A$, $P[\vec{Y} \in g(A)] = P[\vec{X} \in A] = \int \cdots \int_A p_{\vec{X}}(\vec{x})\, dx_1 \cdots dx_n$. Let $\vec{y} = g(\vec{x})$ and change variables in the integral to get

$$P[\vec{Y} \in g(A)] = \int \cdots \int_{g(A)} p_{\vec{X}}(g^{-1}(\vec{y}))\, |J|^{-1}\, dy_1 \cdots dy_n$$

I.e., $P[\vec{Y} \in g(A)] = \int \cdots \int_{g(A)} \text{something}\; dy_1 \cdots dy_n$. Therefore something must be $p_{\vec{Y}}(\vec{y})$. Hence, $p_{\vec{Y}}(\vec{y}) = p_{\vec{X}}(g^{-1}(\vec{y}))\, |J|^{-1}$.

To illustrate the use of Theorem 4.4 we solve again an example previously given on page 265, which we restate here. Let $X_1 \sim \text{Exp}(1)$; $X_2 \sim \text{Exp}(1/2)$; $X_1 \perp X_2$; and $\vec{X} = (X_1, X_2)$, and suppose we want to find $P[|X_1 - X_2| \le 1]$. We solved this problem previously by finding the joint density of $\vec{X} = (X_1, X_2)$, then integrating over the region where $|X_1 - X_2| \le 1$. Our strategy this time is to define new variables $Y_1 = X_1 - X_2$ and $Y_2$, which is essentially arbitrary, find the joint density of $\vec{Y} = (Y_1, Y_2)$, then integrate over the region where $|Y_1| \le 1$. We define $Y_1 = X_1 - X_2$ because that's the variable we're interested in. We need a $Y_2$ because


Theorem 4.4 is for full rank transformations from $\mathbb{R}^n$ to $\mathbb{R}^n$. The precise definition of $Y_2$ is unimportant, as long as the transformation from $\vec{X}$ to $\vec{Y}$ is differentiable and invertible. For convenience, we define $Y_2 = X_2$. With these definitions,

$$
J =
\begin{pmatrix}
\frac{\partial Y_1}{\partial X_1} & \frac{\partial Y_1}{\partial X_2} \\
\frac{\partial Y_2}{\partial X_1} & \frac{\partial Y_2}{\partial X_2}
\end{pmatrix}
=
\begin{pmatrix}
1 & -1 \\
0 & 1
\end{pmatrix},
\qquad |J| = 1,
\qquad X_1 = Y_1 + Y_2 \ \text{ and } \ X_2 = Y_2 .
$$

From the solution on page 265 we know $p_{\vec{X}}(x_1, x_2) = e^{-x_1} \times \tfrac{1}{2} e^{-x_2/2}$, so $p_{\vec{Y}}(y_1, y_2) = e^{-(y_1 + y_2)} \times \tfrac{1}{2} e^{-y_2/2} = \tfrac{1}{2} e^{-y_1} e^{-3y_2/2}$. Figure 4.1 shows the region over which to integrate.

$$
\begin{aligned}
P[|X_1 - X_2| \le 1] &= P[|Y_1| \le 1] = \iint_A p_{\vec{Y}}(y_1, y_2)\, dy_1\, dy_2 \\
&= \frac{1}{2} \int_{-1}^{0} e^{-y_1} \int_{-y_1}^{\infty} e^{-3y_2/2}\, dy_2\, dy_1 + \frac{1}{2} \int_{0}^{1} e^{-y_1} \int_{0}^{\infty} e^{-3y_2/2}\, dy_2\, dy_1 \\
&= \frac{1}{3} \int_{-1}^{0} e^{-y_1} \left[ -e^{-3y_2/2} \right]_{-y_1}^{\infty} dy_1 + \frac{1}{3} \int_{0}^{1} e^{-y_1} \left[ -e^{-3y_2/2} \right]_{0}^{\infty} dy_1 \\
&= \frac{1}{3} \int_{-1}^{0} e^{y_1/2}\, dy_1 + \frac{1}{3} \int_{0}^{1} e^{-y_1}\, dy_1 \\
&= \frac{2}{3}\, e^{y_1/2} \Big|_{-1}^{0} - \frac{1}{3}\, e^{-y_1} \Big|_{0}^{1} \\
&= \frac{2}{3} \left( 1 - e^{-1/2} \right) + \frac{1}{3} \left( 1 - e^{-1} \right) \\
&\approx 0.47
\end{aligned}
\tag{4.3}
$$

Figure 4.1 was produced by the following snippet.
    par ( mar=c(0,0,0,0) )
    plot ( c(0,6), c(0,2), type="n", xlab="", ylab="", xaxt="n",
           yaxt="n", bty="n" )
    # left panel: the (X1,X2) plane
    polygon ( c(1,1.9,1.9,1), c(1,1,1.9,1.9), col=gray(.8),
              border=NA )
    polygon ( c(1,1.2,1.9,1.9,1.7,1), c(1,1,1.7,1.9,1.9,1.2),
              col=gray(.5), border=NA )
    segments ( 0, 1, 1.9, 1, lwd=3 )   # x1 axis
    segments ( 1, 0, 1, 1.9, lwd=3 )   # x2 axis
    text ( c(2,1), c(1,2), c ( expression(bold(X[1])),
           expression(bold(X[2])) ) )
    # right panel: the (Y1,Y2) plane
    polygon ( c(5,5.9,5.9,4), c(1,1,1.9,1.9), col=gray(.8),
              border=NA )
    polygon ( c(5,5.2,5.2,4.8,4.8), c(1,1,1.9,1.9,1.2),
              col=gray(.5), border=NA )
    segments ( 4, 1, 5.9, 1, lwd=3 )   # y1 axis
    segments ( 5, 0, 5, 1.9, lwd=3 )   # y2 axis
    text ( c(6,5), c(1,2), c ( expression(bold(Y[1])),
           expression(bold(Y[2])) ) )
    # arrow pointing from the left panel to the right panel
    arrows ( 2.5, 1, 3.5, 1, length=.2, lwd=2 )

Figure 4.1: The $(X_1, X_2)$ plane and the $(Y_1, Y_2)$ plane. The light gray regions are where $\vec{X}$ and $\vec{Y}$ live. The dark gray regions are where $|X_1 - X_2| \le 1$.
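As a rough numerical check on the value 0.47, the probability can also be estimated by simulation. The sketch below assumes the densities stated in the example, namely that $X_1$ has density $e^{-x_1}$ and $X_2$ has density $\tfrac{1}{2} e^{-x_2/2}$, which in R's rate parameterisation correspond to `rexp(rate = 1)` and `rexp(rate = 1/2)`. The names `n.sim`, `p.sim`, and `p.exact` are ours and do not appear in the text.

```r
# Monte Carlo estimate of P[|X1 - X2| <= 1] (a sketch, not the book's code)
set.seed(1)                         # for reproducibility
n.sim <- 100000                     # number of simulated pairs (our choice)
x1 <- rexp ( n.sim, rate = 1 )      # assumed density exp(-x1)
x2 <- rexp ( n.sim, rate = 1/2 )    # assumed density (1/2) exp(-x2/2)
p.sim <- mean ( abs ( x1 - x2 ) <= 1 )

# closed form from the last line of the derivation above
p.exact <- (2/3) * ( 1 - exp(-1/2) ) + (1/3) * ( 1 - exp(-1) )

print ( c ( simulated = p.sim, exact = p.exact ) )
```

The two numbers should agree to about two decimal places; the simulated value changes slightly with the seed.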


The point of the example, of course, is the method, not the answer. Functions of random variables and random vectors are common in statistics and probability. There are many methods to deal with them. The method of transforming the pdf is one that is often useful.

4.3 Representing Distributions

We usually describe a random variable $Y$ through $p_Y$: its pmf if $Y$ is discrete or its pdf if $Y$ is continuous. But there are at least two alternatives. First, any random variable $Y$ can be described by its cumulative distribution function, or cdf, $F_Y$, which is defined by

$$
F_Y(c) \equiv P[Y \le c] =
\begin{cases}
\sum_{y = -\infty}^{c} P[Y = y] & \text{if } Y \text{ is discrete} \\
\int_{-\infty}^{c} p(y)\, dy & \text{if } Y \text{ is continuous.}
\end{cases}
\tag{4.4}
$$

Equation 4.4 defines the cdf in terms of the pmf or pdf. It is also possible to go the other way. If $Y$ is continuous, then for any number $b \in \mathbb{R}$

$$P(Y \le b) = F(b) = \int_{-\infty}^{b} p(y)\, dy$$

which shows by the Fundamental Theorem of Calculus that $p(y) = F'(y)$. On the other hand, if $Y$ is discrete, then $P[Y = y] = P[Y \le y] - P[Y < y]$

Figure 4.2: pmf's, pdf's, and cdf's


Figure 4.2 was produced by the following snippet.

    par ( mfrow=c(2,2) )
    y <- seq ( -1, 11, by=1 )
    plot ( y, dbinom ( y, 10, .7 ), type="p", ylab="pmf",
           main="Bin (10, .7)" )
    plot ( y, pbinom ( y, 10, .7 ), type="p", pch=16,
           ylab="cdf", main="Bin (10, .7)" )
    segments ( -1:10, pbinom ( -1:10, 10, .7 ),
               0:11, pbinom ( -1:10, 10, .7 ) )
    y <- seq ( 0, 5, len=50 )
    plot ( y, dexp ( y, 1 ), type="l", ylab="pdf", main="Exp(1)" )
    plot ( y, pexp ( y, 1 ), type="l", ylab="cdf", main="Exp(1)" )

• segments ( x0, y0, x1, y1 ) draws line segments. The line segments run from (x0,y0) to (x1,y1). The arguments may be vectors.

The other alternative representation for $Y$ is its moment generating function or mgf $M_Y$. The moment generating function is defined as

$$
M_Y(t) = E[e^{tY}] =
\begin{cases}
\sum_y e^{ty}\, p_Y(y) & \text{if } Y \text{ is discrete} \\
\int e^{ty}\, p_Y(y)\, dy & \text{if } Y \text{ is continuous}
\end{cases}
\tag{4.6}
$$

$M_Y$ is also known as the Laplace transform of $p_Y$.

Because we define the mgf as a sum or integral there is the question of whether the sum or integral is finite and hence whether the mgf is well defined. In Equation 4.6, the mgf is always defined at $t = 0$. (See Exercise 11.) But even if $M_Y(t)$ is not well defined (the integral or sum is not absolutely convergent) for large $t$, what matters for statistical practice is whether $M_Y(t)$ is well defined in a neighborhood of 0, i.e. whether there exists a $\delta > 0$ such that $M_Y(t)$ exists for $t \in (-\delta, \delta)$. The moment generating function gets its name from the following theorem.

Theorem 4.5. If $Y$ has mgf $M_Y$ defined in a neighborhood of 0, then

$$E[Y^n] = M_Y^{(n)}(0) \equiv \frac{d^n}{dt^n} M_Y(t) \Big|_0$$


Proof. We provide the proof for the case $n = 1$. The proof for larger values of $n$ is similar.

$$
\begin{aligned}
\frac{d}{dt} M_Y(t) \Big|_0 &= \frac{d}{dt} \int e^{ty}\, p_Y(y)\, dy\, \Big|_0 \\
&= \int \frac{d}{dt} e^{ty} \Big|_0\, p_Y(y)\, dy \\
&= \int y e^{ty} \Big|_0\, p_Y(y)\, dy \\
&= \int y\, p_Y(y)\, dy \\
&= E[Y]
\end{aligned}
$$

The second line of the proof has the form

$$\frac{d}{dt} \int f(t, y)\, dy = \int \frac{d}{dt} f(t, y)\, dy,$$

an equality which is not necessarily true. It is true for "nice" functions $f$; but establishing exactly what "nice" means requires measure theory and is beyond the scope of this book. We will continue to use the equality without thorough justification.

One could, if one wished, calculate and plot $M_Y(t)$, though there is usually little point in doing so. The main purpose of moment generating functions is in proving theorems and not, as their name might suggest, in deriving moments. And mgf's are useful in proving theorems mostly because of the following two results.

Theorem 4.6. Let $X$ and $Y$ be two random variables with moment generating functions (assumed to exist) $M_X$ and $M_Y$. If $M_X(t) = M_Y(t)$ for all $t$ in some neighborhood of 0, then $F_X = F_Y$; i.e., $X$ and $Y$ have the same distribution.

Theorem 4.7. Let $Y_1, \dots$ be a sequence of random variables with moment generating functions (assumed to exist) $M_{Y_1}, \dots$. Define $M(t) = \lim_{n \to \infty} M_{Y_n}(t)$. If the limit exists for all $t$ in a neighborhood of 0, and if $M(t)$ is a moment generating function, then there is a unique cdf $F$ such that

1. $F(y) = \lim_{n \to \infty} F_{Y_n}(y)$ for all $y$ where $F$ is continuous and


2. $M$ is the mgf of $F$.

Theorems 4.6 and 4.7 both assume that the necessary mgf's exist. It is inconvenient that not all distributions have mgf's. One can avoid the problem by using characteristic functions (also known as Fourier transforms) instead of moment generating functions. The characteristic function is defined as

$$C_Y(t) = E[e^{itY}]$$

where $i = \sqrt{-1}$. All distributions have characteristic functions, and the characteristic function completely characterizes the distribution, so characteristic functions are ideal for our purpose. However, dealing with complex numbers presents its own inconveniences. We shall not pursue this topic further. Proofs of Theorems 4.6 and 4.7 and similar results for characteristic functions are omitted but may be found in more advanced books.

Two more useful results are Theorems 4.8 and 4.9.

Theorem 4.8. Let $X$ be a random variable, $a, b$ be constants, and define $Y = aX + b$. Then $M_Y(t) = e^{bt} M_X(at)$.

Proof.

$$M_Y(t) = E\left[ e^{(aX + b)t} \right] = e^{bt} E\left[ e^{atX} \right] = e^{bt} M_X(at)$$

Theorem 4.9. Let $X$ and $Y$ be independent random variables. Define $Z = X + Y$. Then

$$M_Z(t) = M_X(t)\, M_Y(t)$$

Proof.

$$M_Z(t) = E\left[ e^{(X + Y)t} \right] = E[e^{Xt} e^{Yt}] = E[e^{Xt}]\, E[e^{Yt}] = M_X(t)\, M_Y(t)$$

Corollary 4.10. Let $Y_1, \dots, Y_n$ be a collection of i.i.d. random variables each with mgf $M_Y$. Define $X = Y_1 + \cdots + Y_n$. Then

$$M_X(t) = [M_Y(t)]^n$$
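Theorem 4.5 and Corollary 4.10 can be illustrated numerically. The sketch below is our own and makes two assumptions not found in the text: it uses exponential random variables with mean 1, whose mgf is $1/(1-t)$ for $t < 1$, and it evaluates the mgf at the arbitrary point $t_0 = 0.2$. All variable names are hypothetical.

```r
# Numerical illustration of Corollary 4.10 and Theorem 4.5 (a sketch)
set.seed(2)
n.sim <- 200000
t0 <- 0.2                                  # arbitrary point near 0

# assumed example: Y_i i.i.d. exponential with mean 1, mgf 1/(1-t), t < 1
mgf.exact <- function ( t ) 1 / ( 1 - t )

y <- matrix ( rexp ( 3 * n.sim, rate = 1 ), ncol = 3 )
x <- rowSums ( y )                         # X = Y1 + Y2 + Y3

mgf.sum.sim <- mean ( exp ( t0 * x ) )     # Monte Carlo estimate of M_X(t0)
mgf.sum.thm <- mgf.exact ( t0 )^3          # [M_Y(t0)]^3 from Corollary 4.10

# Theorem 4.5 with n = 1: the derivative of the mgf at 0 equals E[Y] = 1
h <- 1e-5
deriv.at.0 <- ( mgf.exact ( h ) - mgf.exact ( -h ) ) / ( 2 * h )

print ( c ( simulated = mgf.sum.sim, corollary = mgf.sum.thm ) )
print ( deriv.at.0 )
```

The simulated and theoretical mgf values should be close, and the central-difference derivative should be close to 1, the mean of each $Y_i$.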


4.4 Exercises

1. Refer to Equation 4.1 on page 265.
   (a) To help visualize the joint density $p_{\vec{X}}$, make a contour plot. You will have to choose some values of $x_1$, some values of $x_2$, and then evaluate $p_{\vec{X}}(x_1, x_2)$ on all pairs $(x_1, x_2)$ and save the values in a matrix. Finally, pass the values to the contour function. Choose values of $x_1$ and $x_2$ that help you visualize $p_{\vec{X}}$. You may have to choose values by trial and error.
   (b) Draw a diagram that illustrates how to find the region $A$ and the limits of integration in Equation 4.1.
   (c) Supply the missing steps in Equation 4.1. Make sure you understand them. Verify the answer.
   (d) Use R to verify the answer to Equation 4.1 by simulation.

2. Refer to Example 1.6 on page 43 on tree seedlings where $N$ is the number of New seedlings that emerge in a given year and $X$ is the number that survive to the next year. Find $P[X \ge 1]$.

3. $(X_1, X_2)$ have a joint distribution that is uniform on the unit circle. Find $p_{(X_1, X_2)}$.

4. The random vector $(X, Y)$ has pdf $p_{(X,Y)}(x, y) \propto ky$ for some $k > 0$ and $(x, y)$ in the triangular region bounded by the points $(0, 0)$, $(-1, 1)$, and $(1, 1)$.
   (a) Find $k$.
   (b) Find $P[Y \le 1/2]$.
   (c) Find $P[X \le 0]$.
   (d) Find $P[|X - Y| \le 1/2]$.

5. Prove the assertion on page 265 that mutual independence implies pairwise independence.
   (a) Begin with the case of three random variables $\vec{X} = (X_1, X_2, X_3)$. Prove that if $X_1, X_2, X_3$ are mutually independent, then any two of them are independent.
   (b) Generalize to the case $\vec{X} = (X_1, \dots, X_n)$.


6. Refer to Equation 4.2 on page 266. Verify that $X_1 \perp X_2$, $X_1 \perp X_3$, and $X_2 \perp X_3$ but that $X_1$, $X_2$, and $X_3$ are not mutually independent.

7. Prove Lemma 4.1.

8. Fill in the proof of Theorem 4.2 on page 267.

9. $X$ and $Y$ are uniformly distributed in the rectangle whose corners are $(1, 0)$, $(0, 1)$, $(-1, 0)$, and $(0, -1)$.
   (a) i. Find $p(x, y)$.
       ii. Are $X$ and $Y$ independent?
       iii. Find the marginal densities $p(x)$ and $p(y)$.
       iv. Find the conditional densities $p(x \mid y)$ and $p(y \mid x)$.
       v. Find $E[X]$, $E[X \mid Y = .5]$, and $E[X \mid Y = -.5]$.
   (b) Let $U = X + Y$ and $V = X - Y$.
       i. Find the region where $U$ and $V$ live.
       ii. Find the joint density $p(u, v)$.
       iii. Are $U$ and $V$ independent?
       iv. Find the marginal densities $p(u)$ and $p(v)$.
       v. Find the conditional densities $p(u \mid v)$ and $p(v \mid u)$.
       vi. Find $E[U]$, $E[U \mid V = .5]$, and $E[U \mid V = -.5]$.

10. Let the random vector $(U, V)$ be distributed uniformly on the unit square. Let $X = UV$ and $Y = U/V$.
    (a) Draw the region of the $X$-$Y$ plane where the random vector $(X, Y)$ lives.
    (b) Find the joint density of $(X, Y)$.
    (c) Find the marginal density of $X$.
    (d) Find the marginal density of $Y$.
    (e) Find $P[Y > 1]$.
    (f) Find $P[X > 1]$.
    (g) Find $P[Y > 1/2]$.
    (h) Find $P[X > 1/2]$.
    (i) Find $P[XY > 1]$.


    (j) Find $P[XY > 1/2]$.

11. Just below Equation 4.6 is the statement "the mgf is always defined at $t = 0$." For any random variable $Y$, find $M_Y(0)$.

12. Provide the proof of Theorem 4.5 for the case $n = 2$.

13. Refer to Theorem 4.9. Where in the proof is the assumption $X \perp Y$ used?


Chapter 5

Special Distributions

Statisticians often make use of standard parametric families of probability distributions. A parametric family is a collection of probability distributions distinguished by, or indexed by, a parameter. An example is the Binomial distribution introduced in Section 1.3.1. There were $N$ trials. Each had a probability $\theta$ of success. Usually $\theta$ is unknown and could be any number in $(0, 1)$. There is one $\text{Bin}(N, \theta)$ distribution for each value of $\theta$; $\theta$ is a parameter; the set of probability distributions

$$\{ \text{Bin}(N, \theta) : \theta \in (0, 1) \}$$

is a parametric family of distributions.

We have already seen four parametric families: the Binomial (Section 1.3.1), Poisson (Section 1.3.2), Exponential (Section 1.3.3), and Normal (Section 1.3.4) distributions. Chapter 5 examines these in more detail and introduces several others.

5.1 The Binomial and Negative Binomial Distributions

The Binomial Distribution. Statisticians often deal with situations in which there is a collection of trials performed under identical circumstances; each trial results in either success or failure. Typical examples are coin flips (Heads or Tails), medical trials (cure or not), voter polls (Democrat or Republican), and basketball free throws (make or miss). Conditions for the Binomial Distribution are

1. the number of trials $n$ is fixed in advance,


2. the probability of success $\theta$ is the same for each trial, and

3. trials are conditionally independent of each other, given $\theta$.

Let the random variable $X$ be the number of successes in such a collection of trials. Then $X$ is said to have the Binomial distribution with parameters $(n, \theta)$, written $X \sim \text{Bin}(n, \theta)$. The possible values of $X$ are the integers $0, 1, \dots, n$. Figure 1.5 shows examples of Binomial pmf's for several combinations of $n$ and $\theta$. Usually $\theta$ is unknown and the trials are performed in order to learn about $\theta$.

Obviously, large values of $X$ are evidence that $\theta$ is large and small values of $X$ are evidence that $\theta$ is small. But to evaluate the evidence quantitatively we must be able to say more. In particular, once a particular value $X = x$ has been observed we want to quantify how well it is explained by different possible values of $\theta$. That is, we want to know $p(x \mid \theta)$.

Theorem 5.1. If $X \sim \text{Bin}(n, \theta)$ then

$$p_X(x) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}$$

for $x = 0, 1, \dots, n$.

Proof. When the $n$ trials of a Binomial experiment are carried out there will be a sequence of successes (1's) and failures (0's) such as $1000110 \cdots 100$. Let $S = \{0, 1\}^n$ be the set of such sequences and, for each $x \in \{0, 1, \dots, n\}$, let $S_x$ be the subset of $S$ consisting of sequences with $x$ 1's and $n - x$ 0's. If $s \in S_x$ then $\Pr(s) = \theta^x (1 - \theta)^{n - x}$. In particular, all $s$'s in $S_x$ have the same probability. Therefore,

$$
\begin{aligned}
p_X(x) = P(X = x) = P(S_x) &= (\text{size of } S_x) \cdot \theta^x (1 - \theta)^{n - x} \\
&= \binom{n}{x} \theta^x (1 - \theta)^{n - x}
\end{aligned}
$$

The special case $n = 1$ is important enough to have its own name. When $n = 1$ then $X$ is said to have a Bernoulli distribution with parameter $\theta$. We write $X \sim \text{Bern}(\theta)$. If $X \sim \text{Bern}(\theta)$ then $p_X(x) = \theta^x (1 - \theta)^{1 - x}$ for $x \in \{0, 1\}$. Experiments that have two possible outcomes are called Bernoulli trials.

Suppose $X_1 \sim \text{Bin}(n_1, \theta)$, $X_2 \sim \text{Bin}(n_2, \theta)$ and $X_1 \perp X_2$. Let $X_3 = X_1 + X_2$. What is the distribution of $X_3$? Logic suggests the answer is $X_3 \sim \text{Bin}(n_1 + n_2, \theta)$


because there are $n_1 + n_2$ trials, the trials all have the same probability of success $\theta$, the trials are independent of each other (the reason for the $X_1 \perp X_2$ assumption), and $X_3$ is the total number of successes. Theorem 5.3 shows a formal proof of this proposition. But first we need to know the moment generating function.

Theorem 5.2. Let $X \sim \text{Bin}(n, \theta)$. Then

$$M_X(t) = \left( \theta e^t + (1 - \theta) \right)^n$$

Proof. Let $Y \sim \text{Bern}(\theta)$. Then

$$M_Y(t) = E[e^{tY}] = \theta e^t + (1 - \theta).$$

Now let $X = \sum_{i=1}^{n} Y_i$ where the $Y_i$'s are i.i.d. $\text{Bern}(\theta)$ and apply Corollary 4.10.

Theorem 5.3. Suppose $X_1 \sim \text{Bin}(n_1, \theta)$; $X_2 \sim \text{Bin}(n_2, \theta)$; and $X_1 \perp X_2$. Let $X_3 = X_1 + X_2$. Then $X_3 \sim \text{Bin}(n_1 + n_2, \theta)$.

Proof.

$$
\begin{aligned}
M_{X_3}(t) &= M_{X_1}(t)\, M_{X_2}(t) \\
&= \left( \theta e^t + (1 - \theta) \right)^{n_1} \left( \theta e^t + (1 - \theta) \right)^{n_2} \\
&= \left( \theta e^t + (1 - \theta) \right)^{n_1 + n_2}
\end{aligned}
$$

The first equality is by Theorem 4.9; the second is by Theorem 5.2. We recognize the last expression as the mgf of the $\text{Bin}(n_1 + n_2, \theta)$ distribution. So the result follows by Theorem 4.6.

The mean of the Binomial distribution was calculated in Equation 1.11. Theorem 5.4 restates that result and gives the variance and standard deviation.

Theorem 5.4. Let $X \sim \text{Bin}(n, \theta)$. Then

1. $E[X] = n\theta$.

2. $\text{Var}(X) = n\theta(1 - \theta)$.

3. $\text{SD}(X) = \sqrt{n\theta(1 - \theta)}$.
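Before turning to the proof, Theorems 5.3 and 5.4 can be checked informally by simulation. The following is only a sketch: the values of `n1`, `n2`, and `theta` are arbitrary choices of ours, and the variable names are hypothetical. It compares the empirical distribution of $X_1 + X_2$ with the $\text{Bin}(n_1 + n_2, \theta)$ pmf and compares the sample mean and variance with $n\theta$ and $n\theta(1-\theta)$.

```r
# Simulation check of Theorems 5.3 and 5.4 (a sketch with arbitrary values)
set.seed(3)
n1 <- 4; n2 <- 6; theta <- 0.3      # hypothetical parameter values
n.sim <- 100000

x3 <- rbinom ( n.sim, n1, theta ) + rbinom ( n.sim, n2, theta )

# empirical pmf of X3 on 0, ..., n1+n2 versus the Bin(n1+n2, theta) pmf
emp <- as.numeric ( table ( factor ( x3, levels = 0:(n1+n2) ) ) ) / n.sim
thm <- dbinom ( 0:(n1+n2), n1+n2, theta )
max.diff <- max ( abs ( emp - thm ) )

# sample moments versus n*theta and n*theta*(1-theta) with n = n1+n2
mean.sim <- mean ( x3 )
var.sim  <- var ( x3 )

print ( round ( rbind ( empirical = emp, theorem = thm ), 3 ) )
print ( c ( mean = mean.sim, var = var.sim, max.diff = max.diff ) )
```

Agreement in a simulation is of course not a proof; it only shows that the theorems are consistent with what R's random number generators produce.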


Proof. The proof for $E[X]$ was given earlier. If $X \sim \text{Bin}(n, \theta)$, then $X = \sum_{i=1}^{n} X_i$ where $X_i \sim \text{Bern}(\theta)$ and the $X_i$'s are mutually independent. Therefore, by Theorem 1.9, $\text{Var}(X) = n\, \text{Var}(X_i)$. But

$$\text{Var}(X_i) = E(X_i^2) - E(X_i)^2 = \theta - \theta^2 = \theta(1 - \theta).$$

So $\text{Var}(X) = n\theta(1 - \theta)$. The result for $\text{SD}(X)$ follows immediately.

Exercise 1 asks you to prove Theorem 5.4 by moment generating functions.

R comes with built-in functions for working with Binomial distributions. You can get the following information by typing help(dbinom), help(pbinom), help(qbinom), or help(rbinom). There are similar functions for working with other distributions, but we won't repeat their help pages here.

    Usage:

         dbinom(x, size, prob, log = FALSE)
         pbinom(q, size, prob, lower.tail = TRUE, log.p = FALSE)
         qbinom(p, size, prob, lower.tail = TRUE, log.p = FALSE)
         rbinom(n, size, prob)

    Arguments:

        x, q: vector of quantiles.

           p: vector of probabilities.

           n: number of observations. If 'length(n) > 1', the length is
              taken to be the number required.

        size: number of trials.

        prob: probability of success on each trial.

    log, log.p: logical; if TRUE, probabilities p are given as log(p).

    lower.tail: logical; if TRUE (default), probabilities are P[X <= x],
              otherwise, P[X > x].

    Details:


         The binomial distribution with 'size' = n and 'prob' = p has
         density

                   p(x) = choose(n,x) p^x (1-p)^(n-x)

         for x = 0, ..., n.

         If an element of 'x' is not integer, the result of 'dbinom' is
         zero, with a warning. p(x) is computed using Loader's algorithm,
         see the reference below.

         The quantile is defined as the smallest value x such that
         F(x) >= p, where F is the distribution function.

    Value:

         'dbinom' gives the density, 'pbinom' gives the distribution
         function, 'qbinom' gives the quantile function and 'rbinom'
         generates random deviates.

         If 'size' is not an integer, 'NaN' is returned.

    References:

         Catherine Loader. Fast and Accurate Computation of
         Binomial Probabilities; manuscript available from

    See Also:

         'dnbinom' for the negative binomial, and 'dpois' for the Poisson
         distribution.

    Examples:

         # Compute P(45

         ## Using "log = TRUE" for an extended range:
         n <- 2000
         k <- seq(0, n, by = 20)
         plot(k, dbinom(k, n, pi/10, log=TRUE), type='l', ylab="log density",
              main = "dbinom(*, log=TRUE) is better than log(dbinom(*))")
         lines(k, log(dbinom(k, n, pi/10)), col='red', lwd=2)
         ## extreme points are omitted since dbinom gives 0.
         mtext("dbinom(k, log=TRUE)", adj=0)
         mtext("extended range", adj=0, line = -1, font=4)
         mtext("log(dbinom(k))", col="red", adj=1)

Figure 5.1 shows the Binomial pmf for several values of $x$, $n$, and $p$. Note that for a fixed $p$, as $n$ gets larger the pmf looks increasingly like a Normal pdf. That's the Central Limit Theorem. Let $Y_1, \dots, Y_n \sim$ i.i.d. $\text{Bern}(p)$. Then the distribution of $X$ is the same as the distribution of $\sum Y_i$ and the Central Limit Theorem tells us that $\sum Y_i$ looks increasingly Normal as $n \to \infty$.

Also, for a fixed $n$, the pmf looks more Normal when $p = .5$ than when $p = .05$. And that's because convergence under the Central Limit Theorem is faster when the distribution of each $Y_i$ is more symmetric.

Figure 5.1 was produced by

    par ( mfrow=c(3,2) )
    n <- 5
    p <- .05
    x <- 0:5
    plot ( x, dbinom(x,n,p), ylab="p(x)", main="n=5, p=.05" )
    ...

The Negative Binomial Distribution. Rather than fix in advance the number of trials, experimenters will sometimes continue the sequence of trials until a prespecified number of successes $r$ has been achieved. In this case the total number of failures $N$ is the random variable and is said to have the Negative Binomial distribution with parameters $(r, \theta)$, written $N \sim \text{NegBin}(r, \theta)$. (Warning: some authors say that the total number of trials, $N + r$, has the Negative Binomial distribution.) One example is a gambler who decides to play the daily lottery until she wins.


Figure 5.1: The Binomial pmf


The prespecified number of successes is $r = 1$. The number of failures $N$ until she wins is random. In this case, and whenever $r = 1$, $N$ is said to have a Geometric distribution with parameter $\theta$; we write $N \sim \text{Geo}(\theta)$. Often, $\theta$ is unknown. Large values of $N$ are evidence that $\theta$ is small; small values of $N$ are evidence that $\theta$ is large. The probability function is

$$
\begin{aligned}
p_N(k) &= P(N = k) \\
&= P(r - 1 \text{ successes in the first } k + r - 1 \text{ trials and the } k + r\text{'th trial is a success}) \\
&= \binom{k + r - 1}{r - 1} \theta^r (1 - \theta)^k
\end{aligned}
$$

for $k = 0, 1, \dots$.

Let $N_1 \sim \text{NegBin}(r_1, \theta)$, ..., $N_t \sim \text{NegBin}(r_t, \theta)$, and $N_1$, ..., $N_t$ be independent of each other. Then one can imagine a sequence of trials of length $\sum (N_i + r_i)$ having $\sum r_i$ successes. $N_1$ is the number of failures before the $r_1$'th success; ...; $N_1 + \cdots + N_t$ is the number of failures before the $(r_1 + \cdots + r_t)$'th success. It is evident that $N \equiv \sum N_i$ is the number of failures before the $r \equiv \sum r_i$'th success occurs and therefore that $N \sim \text{NegBin}(r, \theta)$.

Theorem 5.5. If $Y \sim \text{NegBin}(r, \theta)$ then $E[Y] = r(1 - \theta)/\theta$ and $\text{Var}(Y) = r(1 - \theta)/\theta^2$.

Proof. It suffices to prove the result for $r = 1$. Then the result for $r > 1$ will follow


by the foregoing argument and Theorems 1.7 and 1.9. For $r = 1$,

$$
\begin{aligned}
E[N] &= \sum_{n=0}^{\infty} n\, P[N = n] \\
&= \sum_{n=1}^{\infty} n\, \theta (1 - \theta)^n \\
&= \theta (1 - \theta) \sum_{n=1}^{\infty} n (1 - \theta)^{n-1} \\
&= -\theta (1 - \theta) \sum_{n=1}^{\infty} \frac{d}{d\theta} (1 - \theta)^n \\
&= -\theta (1 - \theta) \frac{d}{d\theta} \sum_{n=1}^{\infty} (1 - \theta)^n \\
&= -\theta (1 - \theta) \frac{d}{d\theta} \frac{1 - \theta}{\theta} \\
&= -\theta (1 - \theta) \frac{-1}{\theta^2} \\
&= \frac{1 - \theta}{\theta}
\end{aligned}
$$

The trick of writing each term as a derivative, then switching the order of summation and derivative is occasionally useful. Here it is again.

$$
\begin{aligned}
E(N^2) &= \sum_{n=0}^{\infty} n^2\, P[N = n] \\
&= \theta (1 - \theta) \sum_{n=1}^{\infty} \left( n(n - 1) + n \right) (1 - \theta)^{n-1} \\
&= \theta (1 - \theta) \sum_{n=1}^{\infty} n (1 - \theta)^{n-1} + \theta (1 - \theta)^2 \sum_{n=1}^{\infty} n(n - 1)(1 - \theta)^{n-2} \\
&= \frac{1 - \theta}{\theta} + \theta (1 - \theta)^2 \sum_{n=1}^{\infty} \frac{d^2}{d\theta^2} (1 - \theta)^n \\
&= \frac{1 - \theta}{\theta} + \theta (1 - \theta)^2 \frac{d^2}{d\theta^2} \sum_{n=1}^{\infty} (1 - \theta)^n \\
&= \frac{1 - \theta}{\theta} + \theta (1 - \theta)^2 \frac{d^2}{d\theta^2} \frac{1 - \theta}{\theta} \\
&= \frac{1 - \theta}{\theta} + 2\, \frac{\theta (1 - \theta)^2}{\theta^3} \\
&= \frac{2 - 3\theta + \theta^2}{\theta^2}
\end{aligned}
$$

Therefore,

$$\text{Var}(N) = E[N^2] - (E[N])^2 = \frac{1 - \theta}{\theta^2}.$$

The R functions for working with the negative Binomial distribution are dnbinom, pnbinom, qnbinom, and rnbinom. Figure 5.2 displays the Negative Binomial pmf and illustrates the use of qnbinom.

Figure 5.2 was produced with the following snippet.

    r <- c ( 1, 5, 30 )
    p <- c ( .1, .5, .8 )
    par ( mfrow=c(3,3) )
    for ( i in seq(along=r) )
      for ( j in seq(along=p) ) {
        # plot between the 1st and 99th percentiles of NegBin(r[i], p[j])
        lo <- qnbinom(.01,r[i],p[j])
        hi <- qnbinom(.99,r[i],p[j])
        x <- lo:hi
        # title each panel with the parameter values, not the loop indices
        plot ( x, dnbinom(x,r[i],p[j]), ylab="probability", xlab="N",
               main = substitute ( list ( r == a, theta == b),
                                   list(a=r[i],b=p[j]) ) )
      }

Figure 5.2: The Negative Binomial pmf

• lo and hi are the limits on the x-axis of each plot. The use of qnbinom ensures that each plot shows at least 98% of its distribution.

5.2 The Multinomial Distribution

The multinomial distribution generalizes the binomial distribution in the following way. The binomial distribution applies when the outcome of a trial has two possible values; the multinomial distribution applies when the outcome of a trial has more than two possible outcomes. Some examples are

Clinical Trials. In clinical trials, each patient is administered a treatment, usually an experimental treatment or a standard, control treatment. Later, each patient may be scored as either success, failure, or censored. Censoring occurs because patients don't show up for their appointments, move away, or can't be found for some other reason.

Craps. After the come-out roll, each successive roll is either a win, loss, or neither.

Genetics. Each gene comes in several variants. Every person has two copies of the gene, one maternal and one paternal. So the person's status can be described by a pair like $\{a, c\}$ meaning that she has one copy of type $a$ and one copy of type $c$. The pair is called the person's genotype. Each person in a sample can be considered a trial. Geneticists may count how many people have each genotype.

Political Science. In an election, each person prefers either the Republican candidate, the Democrat, the Green, or is undecided.

In this case we count the number of outcomes of each type. If there are $k$ possible outcomes then the result is a vector $(y_1, \dots, y_k)$ where $y_i$ is the number of times that outcome $i$ occurred and $y_1 + \cdots + y_k = n$ is the number of trials.
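The counting just described can be sketched in R. The example below is ours, not the book's: it borrows the three outcome labels from the clinical trials example, and the probabilities in `p` and the number of trials `n` are made-up values chosen only for illustration.

```r
# Simulating n trials with k = 3 possible outcomes and counting each type
# (a sketch; the probabilities are hypothetical)
set.seed(4)
outcomes <- c ( "success", "failure", "censored" )
p <- c ( 0.5, 0.3, 0.2 )            # made-up outcome probabilities
n <- 20                              # number of trials

# one outcome per trial, drawn independently with probabilities p
trials <- sample ( outcomes, size = n, replace = TRUE, prob = p )

# the vector (y_1, ..., y_k): how many times each outcome occurred
y <- table ( factor ( trials, levels = outcomes ) )
print ( y )
print ( sum ( y ) == n )             # the counts add up to the number of trials
```

Converting `trials` to a factor with all three levels keeps outcomes that happen not to occur in the table with a count of zero.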


Let $p \equiv (p_1, \dots, p_k)$ be the probabilities of the $k$ categories and $n$ be the number of trials. We write $Y \sim \text{Mult}(n, p)$. In particular, $Y \equiv (Y_1, \dots, Y_k)$ is a vector of length $k$. Because $Y$ is a vector, so is its expectation

$$E[Y] = \mu = (\mu_1, \dots, \mu_k) \equiv (E[Y_1], \dots, E[Y_k]) = (np_1, \dots, np_k).$$

The $i$'th coordinate, $Y_i$, is a random variable in its own right. Because $Y_i$ counts the number of times outcome $i$ occurred in $n$ trials, its distribution is

$$Y_i \sim \text{Bin}(n, p_i). \qquad \text{(See Exercise 19.)} \tag{5.1}$$

Although the $Y_i$'s are all Binomial, they are not independent. After all, if $Y_1 = n$ then $Y_2 = \cdots = Y_k = 0$, so the $Y_i$'s must be dependent. What is their joint pmf? What is the conditional distribution of, say, $Y_2, \dots, Y_k$ given $Y_1$? The next two theorems provide the answers.

Theorem 5.6. If $Y \sim \text{Mult}(n, p)$ then

$$f_Y(y_1, \dots, y_k) = \binom{n}{y_1 \cdots y_k} p_1^{y_1} \cdots p_k^{y_k}$$

where $\binom{n}{y_1 \cdots y_k}$ is the multinomial coefficient

$$\binom{n}{y_1 \cdots y_k} = \frac{n!}{\prod y_i!}$$

Proof. When the $n$ trials of a multinomial experiment are carried out, there will be a sequence of outcomes such as $abkdbg \cdots f$, where the letters indicate the outcomes of individual trials. One such sequence is

$$\underbrace{a \cdots a}_{y_1 \text{ times}}\ \underbrace{b \cdots b}_{y_2 \text{ times}} \cdots \underbrace{k \cdots k}_{y_k \text{ times}}$$

The probability of this particular sequence is $\prod p_i^{y_i}$. Every sequence with $y_1$ $a$'s, ..., $y_k$ $k$'s has the same probability. So

$$f_Y(y_1, \dots, y_k) = (\text{number of such sequences}) \times \prod p_i^{y_i} = \binom{n}{y_1 \cdots y_k} \prod p_i^{y_i}.$$
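The formula in Theorem 5.6 and the Binomial marginal in Equation 5.1 can both be checked against R's dmultinom function. This is a sketch with arbitrary values of `n`, `p`, and `y` that we chose for illustration; the variable names are hypothetical.

```r
# Checking Theorem 5.6 and Equation 5.1 numerically (a sketch)
n <- 10
p <- c ( 0.2, 0.3, 0.5 )            # arbitrary category probabilities
y <- c ( 2, 3, 5 )                  # arbitrary counts with sum(y) == n

# multinomial coefficient times the product of p_i^y_i
by.formula <- factorial ( n ) / prod ( factorial ( y ) ) * prod ( p^y )
by.R <- dmultinom ( y, size = n, prob = p )
print ( c ( formula = by.formula, dmultinom = by.R ) )

# marginal of Y_1: sum the joint pmf over all (y2, y3) with y2 + y3 = n - y1
y1 <- 2
marg <- sum ( sapply ( 0:(n - y1), function ( y2 )
  dmultinom ( c ( y1, y2, n - y1 - y2 ), size = n, prob = p ) ) )
marg.binom <- dbinom ( y1, n, p[1] )
print ( c ( summed = marg, binomial = marg.binom ) )
```

Both pairs of numbers should agree up to floating point error.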


Theorem 5.7. If $Y \sim \text{Mult}(n, p)$ then

$$(Y_2, \dots, Y_k \mid Y_1 = y_1) \sim \text{Mult}(n - y_1, (p_2^*, \dots, p_k^*))$$

where $p_i^* = p_i / (1 - p_1)$ for $i = 2, \dots, k$.

Proof. See Exercise 18.

R's functions for the multinomial distribution are rmultinom and dmultinom. rmultinom(m,n,p) draws a sample of size m. p is a vector of probabilities. The result is a $k \times m$ matrix. Each column is one draw, so each column sums to n. The user does not specify $k$; it is determined by k=length(p).

5.3 The Poisson Distribution

The Poisson distribution is used to model counts in the following situation.

• There is a domain of study, usually a block of space or time.

• Events arise at seemingly random locations in the domain.

• There is an underlying rate at which events arise.

• The rate does not vary over the domain.

• The occurrence of an event at any location $\ell_1$ is independent of the occurrence of an event at any other location $\ell_2$.

Let $Y$ be the total number of events that arise in the domain. $Y$ has a Poisson distribution with rate parameter $\lambda$, written $Y \sim \text{Poi}(\lambda)$. The pmf is

$$p_Y(y) = \frac{e^{-\lambda} \lambda^y}{y!} \quad \text{for } y = 0, 1, \dots$$

The mean was derived in Chapter 1, Exercise 18a. It is

$$E[Y] = \lambda.$$

Theorem 5.8. Let $Y \sim \text{Poi}(\lambda)$. Then

$$M_Y(t) = e^{\lambda (e^t - 1)}$$


Proof.

$$
\begin{aligned}
M_Y(t) = E[e^{tY}] &= \sum_{y=0}^{\infty} e^{ty}\, p_Y(y) = \sum_{y=0}^{\infty} e^{ty}\, \frac{e^{-\lambda} \lambda^y}{y!} \\
&= \sum_{y=0}^{\infty} \frac{e^{-\lambda} (\lambda e^t)^y}{y!} \\
&= \frac{e^{-\lambda}}{e^{-\lambda e^t}} \sum_{y=0}^{\infty} \frac{e^{-\lambda e^t} (\lambda e^t)^y}{y!} \\
&= e^{\lambda (e^t - 1)}
\end{aligned}
$$

Theorem 5.9. Let $Y \sim \text{Poi}(\lambda)$. Then

$$\text{Var}(Y) = \lambda$$

Proof. Just for fun (!) we will prove the theorem two ways: first directly and then with moment generating functions.

Proof 1.

$$
\begin{aligned}
E[Y^2] &= \sum_{y=0}^{\infty} y^2\, \frac{e^{-\lambda} \lambda^y}{y!} \\
&= \sum_{y=0}^{\infty} y(y - 1)\, \frac{e^{-\lambda} \lambda^y}{y!} + \sum_{y=0}^{\infty} y\, \frac{e^{-\lambda} \lambda^y}{y!} \\
&= \sum_{y=2}^{\infty} y(y - 1)\, \frac{e^{-\lambda} \lambda^y}{y!} + \lambda \\
&= \sum_{z=0}^{\infty} \frac{e^{-\lambda} \lambda^{z+2}}{z!} + \lambda \\
&= \lambda^2 + \lambda
\end{aligned}
$$

So $\text{Var}(Y) = E[Y^2] - (E[Y])^2 = \lambda$.


Proof 2.

$$
\begin{aligned}
E[Y^2] &= \frac{d^2}{dt^2} M_Y(t) \Big|_{t=0} \\
&= \frac{d^2}{dt^2} e^{\lambda (e^t - 1)} \Big|_{t=0} \\
&= \frac{d}{dt} \lambda e^t e^{\lambda (e^t - 1)} \Big|_{t=0} \\
&= \left[ \lambda e^t e^{\lambda (e^t - 1)} + \lambda^2 e^{2t} e^{\lambda (e^t - 1)} \right]_{t=0} \\
&= \lambda + \lambda^2
\end{aligned}
$$

So $\text{Var}(Y) = E[Y^2] - (E[Y])^2 = \lambda$.

Theorem 5.10. Let $Y_i \sim \text{Poi}(\lambda_i)$ for $i = 1, \dots, n$ and let the $Y_i$'s be mutually independent. Let $Y = \sum_1^n Y_i$ and $\lambda = \sum_1^n \lambda_i$. Then $Y \sim \text{Poi}(\lambda)$.

Proof. Using Theorems 4.9 and 5.8 we have

$$M_Y(t) = \prod M_{Y_i}(t) = \prod e^{\lambda_i (e^t - 1)} = e^{\lambda (e^t - 1)}$$

which is the mgf of the $\text{Poi}(\lambda)$ distribution.

Suppose, for $i = 1, \dots, n$, $Y_i$ is the number of events occurring on a domain $D_i$; $Y_i \sim \text{Poi}(\lambda_i)$. Suppose the $D_i$'s are disjoint and the $Y_i$'s are independent. Let $Y = \sum Y_i$ be the number of events arising on $D = \cup D_i$. The logic of the situation suggests that $Y \sim \text{Poi}(\lambda)$ where $\lambda = \sum \lambda_i$. Theorem 5.10 assures us that everything works correctly; that $Y$ does indeed have the $\text{Poi}(\lambda)$ distribution. Another way to put it: If $Y \sim \text{Poi}(\lambda)$, and if the individual events that $Y$ counts are randomly divided into two types $Y_1$ and $Y_2$ according to a binomial distribution with parameter $\theta$, then $Y_1 \sim \text{Poi}(\lambda\theta)$ and $Y_2 \sim \text{Poi}(\lambda(1 - \theta))$ and $Y_1 \perp Y_2$.

Figure 5.3 shows the Poisson pmf for $\lambda = 1, 4, 16, 64$. As $\lambda$ increases the pmf looks increasingly Normal. That's a consequence of Theorem 5.10 and the Central Limit Theorem. When $Y \sim \text{Poi}(\lambda)$, Theorem 5.10 tells us we can think of $Y$ as $Y = \sum_{i=1}^{\lambda} Y_i$ where each $Y_i \sim \text{Poi}(1)$. ($\lambda$ must be an integer for this to be precise.) Then the Central Limit Theorem tells us that $Y$ will be approximately Normal when $\lambda$ is large.
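Both statements, the sum of independent Poissons and the random division of Poisson events into two types, can be explored by simulation. The sketch below is ours: `lambda`, `theta`, and the two rates 1.5 and 4.5 are arbitrary illustrative values, and a sample correlation near zero is only consistent with independence, not a proof of it.

```r
# Simulation sketch: Poisson sums (Theorem 5.10) and Poisson splitting
set.seed(5)
n.sim <- 100000
lambda <- 6; theta <- 0.25           # hypothetical rate and splitting probability

# splitting: each of the y events is of type 1 with probability theta
y  <- rpois ( n.sim, lambda )
y1 <- rbinom ( n.sim, size = y, prob = theta )
y2 <- y - y1

# if Y1 ~ Poi(lambda*theta), its mean and variance both equal lambda*theta
print ( c ( mean = mean(y1), var = var(y1), target = lambda * theta ) )
print ( c ( mean = mean(y2), var = var(y2), target = lambda * (1-theta) ) )
print ( cor ( y1, y2 ) )             # near 0 if Y1 and Y2 are independent

# summing: Poi(1.5) + Poi(4.5) should behave like Poi(6)
s <- rpois ( n.sim, 1.5 ) + rpois ( n.sim, 4.5 )
print ( c ( mean = mean(s), var = var(s), target = 6 ) )
```

Equal mean and variance is a characteristic feature of the Poisson distribution, which is why the sketch reports both.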

Figure 5.3: Poisson pmf for $\lambda = 1, 4, 16, 64$

Figure 5.3 was produced with the following snippet.

    y <- 0:7
    plot ( y, dpois(y,1), xlab="y", ylab=expression(p[Y](y)),
           main=expression(lambda==1) )
    y <- 0:10
    plot ( y, dpois(y,4), xlab="y", ylab=expression(p[Y](y)),
           main=expression(lambda==4) )
    y <- 6:26
    plot ( y, dpois(y,16), xlab="y", ylab=expression(p[Y](y)),
           main=expression(lambda==16) )
    y <- 44:84
    plot ( y, dpois(y,64), xlab="y", ylab=expression(p[Y](y)),
           main=expression(lambda==64) )

One of the early uses of the Poisson distribution was in The probability variations in the distribution of $\alpha$ particles by Rutherford and Geiger [1910]. An $\alpha$ particle is a Helium nucleus, or two protons and two neutrons.

Example 5.1 (Rutherford and Geiger)
The phenomenon of radioactivity was beginning to be understood in the early 20th century. In their 1910 article, Rutherford and Geiger write

    In counting the $\alpha$ particles emitted from radioactive substances ... [it] is of importance to settle whether ... variations in distribution are in agreement with the laws of probability, i.e. whether the distribution of $\alpha$ particles on an average is that to be anticipated if the $\alpha$ particles are expelled at random both in regard to space and time. It might be conceived, for example, that the emission of an $\alpha$ particle might precipitate the disintegration of neighbouring atoms, and so lead to a distribution of $\alpha$ particles at variance with the simple probability law.

So Rutherford and Geiger are going to do three things in their article. They're going to count $\alpha$ particle emissions from some radioactive substance; they're going to derive the distribution of $\alpha$ particle emissions according to theory; and they're going to compare the actual and theoretical distributions.

Here they describe their experimental setup.

    The source of radiation was a small disk coated with polonium, which was placed inside an exhausted tube, closed at one end by a zinc sulphide screen. The scintillations were counted in the usual way ... the number of scintillations ... corresponding to 1/8 minute intervals were counted....

    The following example is an illustration of the result obtained. The numbers, given in the horizontal lines, correspond to the number of scintillations for successive intervals of 7.5 seconds.

                                               Total per minute.
    1st minute:  3  7  4  4  2  3  2  0 .....  25
    2nd          5  2  5  4  3  5  4  2 ....   30
    3rd          5  4  1  3  3  1  5  2 ....   24
    4th          8  2  2  2  3  4  2  6 ....   31
    5th          7  4  2  6  4  5 10  4 ....   42
    Average for 5 minutes ...                  30.4
    True average ...........                   31.0

And here they describe their theoretical result.

    The distribution of $\alpha$ particles according to the law of probability was kindly worked out for us by Mr. Bateman. The mathematical theory is appended as a note to this paper. Mr. Bateman has shown that if $x$ be the true average number of particles for any given interval falling on the screen from a constant source, the probability that $n$ $\alpha$ particles are observed in the same interval is given by $\frac{x^n}{n!} e^{-x}$. $n$ is here a whole number, which may have all positive values from 0 to $\infty$. The value of $x$ is determined by counting a large number of scintillations and dividing by the number of intervals involved. The probability for $n$ $\alpha$ particles in the given interval can then at once be calculated from the theory.

Refer to Bateman [1910] for his derivation. Table 5.1 shows their data. As Rutherford and Geiger explain:

    For convenience the tape was measured up in four parts, the results of which are given separately in horizontal columns I. to IV.

    For example (see column I.), out of 792 intervals of 1/8 minute, in which 3179 $\alpha$ particles were counted, the number of intervals 3 $\alpha$ particles was 152. Combining the four columns, it is seen that out of 2608 intervals containing

    10,097 particles, the number of times that 3 $\alpha$ particles were observed was 525. The number calculated from the equation was the same, viz. 525.

Finally, how did Rutherford and Geiger compare their actual and theoretical distributions? They did it with a plot, which we reproduce as Figure 5.4. Their conclusion:

    It will be seen that, on the whole, theory and experiment are in excellent accord.... We may consequently conclude that the distribution of $\alpha$ particles in time is in agreement with the laws of probability and that the $\alpha$ particles are emitted at random.... Apart from their bearing on radioactive problems, these results are of interest as an example of a method of testing the laws of probability by observing the variations in quantities involved in a spontaneous material process.

Example 5.2 (neurobiology)
This example continues Example 2.6. We would like to know whether this neuron responds differently to different tastants and, if so, how. To that end, we'll see how often the neuron fires in a short period of time after receiving a tastant and we'll compare the results for different tastants. Specifically, we'll count the number of spikes in the 150 milliseconds (150 msec = .15 s) immediately following the delivery of each tastant. (150 msec is about the rate at which rats can lick and is thought by neurobiologists to be about the right interval of time.) Let $Y_{ij}$ be the number of spikes in the 150 msec following the $j$'th delivery of tastant $i$. Because we're counting the number of events in a fixed period of time we'll adopt a Poisson model:

$$Y_{ij} \sim \mathrm{Poi}(\lambda_i)$$

where $\lambda_i$ is the average firing rate of this neuron to tastant $i$.

We begin by making a list to hold the data. There should be one element for each tastant. That element should be a vector whose length is the number of times that tastant was delivered. Here is the R code to do it. (Refer to Example 2.6 for reading in the data.)

    nspikes <- list(
      MSG100 = rep ( NA, length(tastants$MSG100) ),
      MSG300 = rep ( NA, length(tastants$MSG300) ),
      NaCl100 = rep ( NA, length(tastants$NaCl100) ),
      NaCl300 = rep ( NA, length(tastants$NaCl300) ),
      water = rep ( NA, length(tastants$water) )
    )

                        Number of α particles                                          Number of     Number of   Average
                         0    1    2    3    4    5    6    7   8   9  10  11  12  13  14   α particles   intervals   number
    I                   15   56  106  152  170  122   88   50  17  12   3   0   0   1   0       3179         792       4.01
    II                  17   39   88  116  120   98   63   37   4   9   4   1   0   0   0       2334         596       3.92
    III                 15   56   97  139  118   96   60   26  18   3   3   1   0   0   0       2373         632       3.75
    IV                  10   52   92  118  124   92   62   26   6   3   0   2   0   0   1       2211         588       3.76
    Sum                 57  203  383  525  532  408  273  139  45  27  10   4   0   1   1      10097        2608       3.87
    Theoretical values  54  210  407  525  508  394  254  140  68  29  11   4   1   4   1

Table 5.1: Rutherford and Geiger's data
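The "Theoretical values" row of Table 5.1 can be approximately reproduced from the "Sum" row. The sketch below is not Rutherford and Geiger's calculation; it assumes only that the expected number of intervals with $n$ particles is the number of intervals times the Poisson probability, with the rate estimated by the average count per interval. The variable names are illustrative.

```r
# Observed number of intervals containing 0, 1, ..., 14 alpha particles
# (the "Sum" row of Table 5.1)
n.particles <- 0:14
observed <- c(57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1)

n.intervals <- sum(observed)                # total number of intervals
total <- sum(n.particles * observed)        # total number of particles
x.hat <- total / n.intervals                # average number per interval

# expected number of intervals for each count under the Poisson law
expected <- n.intervals * dpois(n.particles, x.hat)

# side-by-side comparison
round(cbind(n.particles, observed, expected), 1)
```

The totals recover the 2608 intervals and 10,097 particles quoted in the article, and the expected count for 3 particles comes out close to 525.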

Figure 5.4: Rutherford and Geiger's Figure 1 comparing theoretical (solid line) to actual (open circles) distribution of $\alpha$ particle counts.

Now we fill in each element by counting the number of neuron firings in the time interval.

    for ( i in seq(along=nspikes) )
      for ( j in seq(along=nspikes[[i]]) )
        nspikes[[i]][j] <- sum ( spikes[[8]] > tastants[[i]][j]
                               & spikes[[8]] <= tastants[[i]][j] + .15
                               )

Now we can see how many times the neuron fired after each delivery of, say, MSG100 by typing nspikes$MSG100.

Figure 5.5 compares the five tastants graphically. Panel A is a stripchart. It has five tick marks on the x-axis for the five tastants. Above each tick mark is a collection of circles. Each circle represents one delivery of the tastant and shows how many times the neuron fired in the 150 msec following that delivery. Panel B shows much the same information in a mosaic plot. The heights of the boxes show how often that tastant produced 0, 1, ..., 5 spikes. The width of each column shows how often that tastant was delivered. Panel C shows much the same information in yet a different way. It has one line for each tastant; that line shows how often the neuron responded with 0, 1, ..., 5 spikes. Panel D compares likelihood functions. The five curves are the likelihood functions for $\lambda_1, \ldots, \lambda_5$.

There does not seem to be much difference in the response of this neuron to different tastants. Although we can compute the m.l.e. $\hat\lambda_i$'s with

    lapply ( nspikes, mean )

and find that they range from a low of $\hat\lambda_3 \approx 0.08$ for .1M NaCl to a high of $\hat\lambda_1 \approx 0.4$ for .1M MSG, panel D suggests the plausibility of $\lambda_1 = \cdots = \lambda_5 \approx .2$.

Figure 5.5 was produced with the following snippet.

    spiketable <- matrix ( NA, length(nspikes), 6,
                           dimnames = list ( tastant = 1:5,
                                             counts = 0:5 )
                         )
    for ( i in seq(along=nspikes) )
      spiketable[i,] <- hist ( nspikes[[i]], seq(-.5,5.5,by=1),
                               plot=F )$counts

Figure 5.5: Numbers of firings of a neuron in 150 msec after five different tastants. Tastants: 1=MSG .1M; 2=MSG .3M; 3=NaCl .1M; 4=NaCl .3M; 5=water. Panels: A: A stripchart. Each circle represents one delivery of a tastant. B: A mosaic plot. C: Each line represents one tastant. D: Likelihood functions. Each line represents one tastant.

    freqtable <- apply ( spiketable, 1, function(x) x/sum(x) )

• The line spiketable <- ... creates a matrix to hold the data and illustrates the use of dimnames to name the dimensions. Some plotting commands use those names for labelling axes.
• The line spiketable[i,] <- ... shows an interesting use of the hist command. Instead of plotting a histogram it can simply return the counts.
• The line freqtable <- ... divides each row of the matrix by its sum, turning counts into proportions.

But let's investigate a little further. Do the data really follow a Poisson distribution? Figure 5.6 shows the Poi(.2) distribution as a line, while the circles show the actual fractions of firings. There is apparently good agreement. But numbers close to zero can be deceiving. The R command dpois ( 0:5, .2 ) reveals that the probability of getting 5 spikes is less than 0.00001, assuming $\lambda \approx 0.2$. So either the $\lambda_i$'s are not all approximately .2, neuron spiking does not really follow a Poisson distribution, or we have witnessed a very unusual event.

Figure 5.6 was produced with the following snippet.

    matplot ( 0:5, freqtable, pch=1, col=1,
              xlab="number of firings", ylab="fraction" )
    lines ( 0:5, dpois ( 0:5, 0.2 ) )

5.4 The Uniform Distribution

The Discrete Uniform Distribution  The discrete uniform distribution is the distribution that gives equal weight to each integer $1, \ldots, n$. We write $Y \sim \mathrm{U}(1, n)$. The pmf is

$$p(y) = 1/n \qquad (5.2)$$

for $y = 1, \ldots, n$. The discrete uniform distribution is used to model, for example, dice rolls, or any other experiment in which the outcomes are deemed equally

Figure 5.6: The line shows Poisson probabilities for $\lambda = 0.2$; the circles show the fraction of times the neuron responded with 0, 1, ..., 5 spikes for each of the five tastants.

likely. The only parameter is $n$. It is not an especially useful distribution in practical work but can be used to illustrate concepts in a simple setting. For an applied example see Exercise 22.

The Continuous Uniform Distribution  The continuous uniform distribution is the distribution whose pdf is flat over the interval $[a, b]$. We write $Y \sim \mathrm{U}(a, b)$. Although the notation might be confused with the discrete uniform, the context will indicate which is meant. The pdf is

$$p(y) = 1/(b - a)$$

for $y \in [a, b]$. The mean, variance, and moment generating function are left as Exercise 23.

Suppose we observe a random sample $y_1, \ldots, y_n$ from $\mathrm{U}(a, b)$. What is the m.l.e. $(\hat a, \hat b)$? The joint density is

$$p(y_1, \ldots, y_n) = \begin{cases} \left(\frac{1}{b-a}\right)^n & \text{if } a \le y_{(1)} \text{ and } b \ge y_{(n)} \\ 0 & \text{otherwise} \end{cases}$$

which is maximized, as a function of $(a, b)$, if $b - a$ is as small as possible without making the joint density 0. Thus, $\hat a = y_{(1)}$ and $\hat b = y_{(n)}$.

5.5 The Gamma, Exponential, and Chi Square Distributions

$\Gamma$ is the upper case Greek letter Gamma. The gamma function is a special mathematical function defined on $\mathbb{R}^+$ as

$$\Gamma(\alpha) = \int_0^\infty t^{\alpha - 1} e^{-t} \, dt$$

Information about the gamma function can be found in mathematics texts and reference books. For our purposes, the key facts are:

• $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$ for $\alpha > 0$
• $\Gamma(n) = (n - 1)!$ for positive integers $n$
• $\Gamma(1/2) = \sqrt{\pi}$
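The three key facts about the gamma function can be checked with R's built-in gamma function. This is a small illustrative sketch; the particular values of `alpha` and the choice $\alpha = 2.5$ for the integral are assumptions made here for the check, not values from the text.

```r
# Fact 1: Gamma(alpha + 1) = alpha * Gamma(alpha), checked at a few alphas
alpha <- c(0.5, 1.7, 3, 10.2)
rec.gap <- max(abs(gamma(alpha + 1) - alpha * gamma(alpha)) / gamma(alpha + 1))

# Fact 2: Gamma(n) = (n - 1)! for positive integers n
n <- 1:10
fact.gap <- max(abs(gamma(n) - factorial(n - 1)) / factorial(n - 1))

# Fact 3: Gamma(1/2) = sqrt(pi)
half.gap <- abs(gamma(0.5) - sqrt(pi))

# The defining integral, evaluated numerically at alpha = 2.5
num <- integrate(function(t) t^(2.5 - 1) * exp(-t), 0, Inf)$value
int.gap <- abs(num - gamma(2.5))
```

All four gaps should be negligible; `int.gap` is limited by the accuracy of numerical integration rather than by the identity itself.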

For any positive numbers $\alpha$ and $\beta$, the Gamma($\alpha, \beta$) distribution has pdf

$$p(y) = \frac{1}{\Gamma(\alpha)\beta^\alpha} y^{\alpha - 1} e^{-y/\beta} \quad \text{for } y \ge 0 \qquad (5.3)$$

We write $Y \sim \mathrm{Gam}(\alpha, \beta)$.

Figure 5.7 shows Gamma densities for four values of $\alpha$ and four values of $\beta$.

• In each panel of Figure 5.7 the curves for different $\alpha$'s have different shapes. Sometimes $\alpha$ is called the shape parameter of the Gamma distribution.
• The four panels look identical except for the axes. I.e., the four curves with $\alpha = .5$, one from each panel, have the same shape but different scales. The different scales correspond to different values of $\beta$. For this reason $\beta$ is called a scale parameter. One can see directly from Equation 5.3 that $\beta$ is a scale parameter because $p(y)$ depends on $y$ only through the ratio $y/\beta$. The idea of scale parameter is embodied in Theorem 5.11. See Section 8.6 for more on scale parameters.

Figure 5.7 was produced by the following snippet.

    par ( mfrow=c(2,2) )
    shape <- c ( .5, 1, 2, 4 )
    scale <- c ( .5, 1, 2, 4 )
    leg <- expression ( alpha == .5, alpha == 1,
                        alpha == 2, alpha == 4 )

    for ( i in seq(along=scale) ) {
      ymax <- scale[i]*max(shape) + 3*sqrt(max(shape))*scale[i]
      y <- seq ( 0, ymax, length=100 )
      den <- NULL
      for ( sh in shape )
        den <- cbind ( den, dgamma(y,shape=sh,scale=scale[i]) )
      matplot ( y, den, type="l", main=letters[i], ylab="p(y)" )
      legend ( ymax*.1, max(den[den!=Inf]), legend = leg )
    }

Figure 5.7: Gamma densities for various values of $\alpha$ and $\beta$.
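The role of $\beta$ as a scale parameter can be seen numerically: the Gam($\alpha, \beta$) density at $y$ equals the Gam($\alpha$, 1) density at $y/\beta$, divided by $\beta$. The sketch below assumes illustrative values of `alpha` and `beta` and also compares R's dgamma with Equation 5.3 written out directly.

```r
alpha <- 2     # shape (illustrative)
beta <- 4      # scale (illustrative)
y <- seq(0.1, 40, length = 200)

# Gam(alpha, beta) density from R
lhs <- dgamma(y, shape = alpha, scale = beta)

# the same density obtained by rescaling the Gam(alpha, 1) density
rhs <- dgamma(y / beta, shape = alpha, scale = 1) / beta
scale.gap <- max(abs(lhs - rhs))

# Equation 5.3 written out directly
direct <- y^(alpha - 1) * exp(-y / beta) / (gamma(alpha) * beta^alpha)
formula.gap <- max(abs(lhs - direct))
```

Note that dgamma is called with the argument scale, which matches the parameterization of Equation 5.3; R's alternative argument rate is the reciprocal of the scale.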

Theorem 5.11. Let $X \sim \mathrm{Gam}(\alpha, \beta)$ and let $Y = cX$. Then $Y \sim \mathrm{Gam}(\alpha, c\beta)$.

Proof. Use Theorem 1.1.

$$p_X(x) = \frac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha - 1} e^{-x/\beta}$$

Since $Y = cX$, $x = y/c$ and $dx/dy = 1/c$, so

$$p_Y(y) = \frac{1}{c\,\Gamma(\alpha)\beta^\alpha} (y/c)^{\alpha - 1} e^{-y/(c\beta)} = \frac{1}{\Gamma(\alpha)(c\beta)^\alpha} y^{\alpha - 1} e^{-y/(c\beta)}$$

which is the $\mathrm{Gam}(\alpha, c\beta)$ density. Also see Exercise 9.

The mean, mgf, and variance are recorded in the next several theorems.

Theorem 5.12. Let $Y \sim \mathrm{Gam}(\alpha, \beta)$. Then $E[Y] = \alpha\beta$.

Proof.

$$E[Y] = \int_0^\infty y \frac{1}{\Gamma(\alpha)\beta^\alpha} y^{\alpha - 1} e^{-y/\beta} \, dy = \frac{\Gamma(\alpha + 1)\beta}{\Gamma(\alpha)} \int_0^\infty \frac{1}{\Gamma(\alpha + 1)\beta^{\alpha + 1}} y^{\alpha} e^{-y/\beta} \, dy = \alpha\beta.$$

The last equality follows because (1) $\Gamma(\alpha + 1) = \alpha\Gamma(\alpha)$, and (2) the integrand is a Gamma density so the integral is 1. Also see Exercise 9.

The last trick in the proof (recognizing an integrand as a density and concluding that the integral is 1) is very useful. Here it is again.

Theorem 5.13. Let $Y \sim \mathrm{Gam}(\alpha, \beta)$. Then the moment generating function is $M_Y(t) = (1 - t\beta)^{-\alpha}$ for $t < 1/\beta$.

Proof.

$$M_Y(t) = \int_0^\infty e^{ty} \frac{1}{\Gamma(\alpha)\beta^\alpha} y^{\alpha - 1} e^{-y/\beta} \, dy = \frac{\left(\frac{\beta}{1 - t\beta}\right)^\alpha}{\beta^\alpha} \int_0^\infty \frac{1}{\Gamma(\alpha)\left(\frac{\beta}{1 - t\beta}\right)^\alpha} y^{\alpha - 1} e^{-y\frac{1 - t\beta}{\beta}} \, dy = (1 - t\beta)^{-\alpha}$$
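Theorems 5.12 and 5.13 can be checked by numerical integration. The following sketch assumes illustrative values $\alpha = 3$, $\beta = 2$ and $t = 0.2$ (which satisfies $t < 1/\beta$); it is a sanity check, not a proof.

```r
alpha <- 3
beta <- 2
t <- 0.2     # must satisfy t < 1/beta for the mgf to exist

# Gam(alpha, beta) density in the scale parameterization of Equation 5.3
dens <- function(y) dgamma(y, shape = alpha, scale = beta)

# E[Y] and M_Y(t) by numerical integration
mean.num <- integrate(function(y) y * dens(y), 0, Inf)$value
mgf.num <- integrate(function(y) exp(t * y) * dens(y), 0, Inf)$value

# the values claimed by Theorems 5.12 and 5.13
mean.thm <- alpha * beta
mgf.thm <- (1 - t * beta)^(-alpha)
```

The numerical and theoretical values should agree to within the tolerance of integrate.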

Theorem 5.14. Let $Y \sim \mathrm{Gam}(\alpha, \beta)$. Then

$$\mathrm{Var}(Y) = \alpha\beta^2 \quad \text{and} \quad \mathrm{SD}(Y) = \sqrt{\alpha}\,\beta.$$

Proof. See Exercise 10.

The Exponential Distribution  We often have to deal with situations such as

• the lifetime of an item
• the time until a specified event happens

The most fundamental probability distribution for such situations is the exponential distribution. Let $Y$ be the time until the item dies or the event occurs. If $Y$ has an exponential distribution then for some $\lambda > 0$ the pdf of $Y$ is

$$p_Y(y) = \lambda^{-1} e^{-y/\lambda} \quad \text{for } y \ge 0$$

and we write $Y \sim \mathrm{Exp}(\lambda)$. This density is pictured in Figure 5.8 (a repeat of Figure 1.7) for four values of $\lambda$. The exponential distribution is the special case of the Gamma distribution when $\alpha = 1$. The mean, SD, and mgf are given by Theorems 5.12-5.14.

Each exponential density has its maximum at $y = 0$ and decreases monotonically. The value of $\lambda$ determines the value $p_Y(0 \mid \lambda)$ and the rate of decrease. Usually $\lambda$ is unknown. Because $\lambda$ is the mean of the distribution in this parameterization, small values of $y$ are evidence for small values of $\lambda$; large values of $y$ are evidence for large values of $\lambda$.

Example 5.3 (Radioactive Decay)
It is well known that some chemical elements are radioactive. Every atom of a radioactive element will eventually decay into smaller components. E.g., uranium-238 (by far the most abundant uranium isotope, $^{238}$U) decays into thorium-234 and an $\alpha$ particle while plutonium-239 (the isotope used in nuclear weapons, $^{239}$Pu) decays into uranium-235 ($^{235}$U) and an $\alpha$ particle. (See http://www.epa.gov/radiation/radionuclides for more information.)

The time $Y$ at which a particular atom decays is a random variable that has an exponential distribution. Each radioactive isotope has its own distinctive value of $\lambda$. A radioactive isotope is usually characterized by its median lifetime, or half-life, instead of

Figure 5.8: Exponential densities

$\lambda$. The half-life is the value $m$ which satisfies $P[Y \le m] = P[Y \ge m] = 0.5$. The half-life $m$ can be found by solving

$$\int_0^m \lambda^{-1} e^{-y/\lambda} \, dy = 0.5.$$

The answer is $m = \lambda \log 2$. You will be asked to verify this claim in Exercise 29.

Uranium-238 has a half-life of 4.47 billion years. Thus its $\lambda$ is about 6.45 billion. Plutonium-239 has a half-life of 24,100 years. Thus its $\lambda$ is about 35,000.

Exponential distributions have an interesting and unique memoryless property. To demonstrate, we examine the $\mathrm{Exp}(\lambda)$ distribution as a model for $T$, the amount of time a computer Help line caller spends on hold. Suppose the caller has already spent $t$ minutes on hold; i.e., $T \ge t$. Let $S$ be the remaining time on hold; i.e., $S = T - t$. What is the distribution of $S$ given $T > t$? For any number $r > 0$,

$$P[S > r \mid T \ge t] = P[T \ge t + r \mid T \ge t] = \frac{P[T \ge t + r, T \ge t]}{P[T \ge t]} = \frac{P[T \ge t + r]}{P[T \ge t]} = \frac{e^{-(t + r)/\lambda}}{e^{-t/\lambda}} = e^{-r/\lambda}.$$

In other words, $S$ has an $\mathrm{Exp}(\lambda)$ distribution (Why?) that does not depend on the currently elapsed time $t$ (Why?). This is a unique property of the Exponential distribution; no other continuous distribution has it. Whether it makes sense for the amount of time on hold is a question that could be verified by looking at data. If it's not sensible, then $\mathrm{Exp}(\lambda)$ is not an accurate model for $T$.

Example 5.4
Some data here that don't look exponential

The Poisson process  There is a close relationship between Exponential, Gamma, and Poisson distributions. For illustration consider a company's customer call center. Suppose that calls arrive according to a rate $\lambda$ such that

1. in a time interval of length $T$, the number of calls is a random variable with distribution $\mathrm{Poi}(\lambda T)$ and
2. if $I_1$ and $I_2$ are disjoint time intervals then the number of calls in $I_1$ is independent of the number of calls in $I_2$.
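Before continuing with the Poisson process, the half-life relation $m = \lambda \log 2$ and the memoryless property of the exponential distribution stated earlier in this section can be checked in a few lines. This is a hedged sketch: the half-lives are those quoted in Example 5.3, while `lam`, `t` and `r` are arbitrary illustrative values. Note that R's pexp and qexp take a rate, which is $1/\lambda$ in the book's parameterization.

```r
# half-lives quoted in Example 5.3, in years
half.life <- c(U238 = 4.47e9, Pu239 = 24100)

# lambda = m / log(2), from m = lambda * log(2)
lambda <- half.life / log(2)

# check: the median of Exp(lambda) should equal the half-life
med <- qexp(0.5, rate = 1 / lambda)

# memoryless property: P[T >= t + r | T >= t] should equal P[T >= r]
lam <- 5; t <- 3; r <- 2     # illustrative values
cond <- pexp(t + r, rate = 1 / lam, lower.tail = FALSE) /
        pexp(t, rate = 1 / lam, lower.tail = FALSE)
uncond <- pexp(r, rate = 1 / lam, lower.tail = FALSE)
```

The computed `lambda` values should be close to the 6.45 billion and 35,000 years given in the text, and `cond` should equal `uncond` up to rounding.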

When calls arrive in this way we say the calls follow a Poisson process.

Suppose we start monitoring calls at time $t_0$. Let $T_1$ be the time of the first call after $t_0$ and $Y_1 = T_1 - t_0$, the time until the first call. $T_1$ and $Y_1$ are random variables. What is the distribution of $Y_1$? For any positive number $y$,

$$\Pr[Y_1 > y] = \Pr[\text{no calls in } [t_0, t_0 + y]] = e^{-\lambda y}$$

where the second equality follows by the Poisson assumption. But

$$\Pr[Y_1 > y] = e^{-\lambda y} \;\Rightarrow\; \Pr[Y_1 \le y] = 1 - e^{-\lambda y} \;\Rightarrow\; p_{Y_1}(y) = \lambda e^{-\lambda y} \;\Rightarrow\; Y_1 \sim \mathrm{Exp}(1/\lambda)$$

What about the time to the second call? Let $T_2$ be the time of the second call after $t_0$ and $Y_2 = T_2 - t_0$. What is the distribution of $Y_2$? For any $y > 0$,

$$\Pr[Y_2 > y] = \Pr[\text{fewer than 2 calls in } [t_0, t_0 + y]] = \Pr[\text{0 calls in } [t_0, t_0 + y]] + \Pr[\text{1 call in } [t_0, t_0 + y]] = e^{-\lambda y} + y\lambda e^{-\lambda y}$$

and therefore

$$p_{Y_2}(y) = \lambda e^{-\lambda y} - \lambda e^{-\lambda y} + y\lambda^2 e^{-\lambda y} = \frac{\lambda^2}{\Gamma(2)} y e^{-\lambda y}$$

so $Y_2 \sim \mathrm{Gam}(2, 1/\lambda)$.

In general, the time $Y_n$ until the $n$'th call has the $\mathrm{Gam}(n, 1/\lambda)$ distribution. This fact is an example of the following theorem.

Theorem 5.15. Let $Y_1, \ldots, Y_n$ be mutually independent and let $Y_i \sim \mathrm{Gam}(\alpha_i, \beta)$. Then

$$Y \equiv \sum Y_i \sim \mathrm{Gam}(\alpha, \beta)$$

where $\alpha \equiv \sum \alpha_i$.

Proof. See Exercise 30.

In Theorem 5.15 note that the $Y_i$'s must all have the same $\beta$ even though they may have different $\alpha_i$'s.

Poisson-Gamma conjugacy
F = Gam/Gam

The Chi-squared Distribution  The Gamma distribution with $\beta = 2$ and $\alpha = p/2$ where $p$ is a positive integer is called the chi-squared distribution with $p$ degrees of freedom. We write $Y \sim \chi^2_p$.

Theorem 5.16. Let $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{N}(0, 1)$. Define $X = \sum Y_i^2$. Then $X \sim \chi^2_n$.

Proof. This theorem will be proved in Section 5.7.
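Two of the relationships in this section can be verified directly with R's distribution functions: the waiting time to the $n$'th call in a Poisson process is Gam($n$, $1/\lambda$), and the chi-squared distribution with $p$ degrees of freedom is Gam($p/2$, 2). The sketch below assumes illustrative values for `lambda`, `n` and `p`.

```r
lambda <- 2                  # calls per unit time (illustrative)
n <- 3                       # wait for the 3rd call (illustrative)
y <- seq(0.1, 5, by = 0.1)   # elapsed times at which to compare

# Pr[Y_n > y] = Pr[fewer than n calls in an interval of length y],
# where the number of calls is Poi(lambda * y)
surv.pois <- ppois(n - 1, lambda * y)

# the same probability from the Gam(n, 1/lambda) distribution
surv.gam <- pgamma(y, shape = n, scale = 1 / lambda, lower.tail = FALSE)
proc.gap <- max(abs(surv.pois - surv.gam))

# chi-squared with p degrees of freedom versus Gam(p/2, 2)
p <- 5
x <- seq(0.1, 20, by = 0.1)
chisq.gap <- max(abs(dchisq(x, df = p) - dgamma(x, shape = p / 2, scale = 2)))
```

Both gaps should be zero up to rounding error.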

5.6 The Beta Distribution

For positive numbers $\alpha$ and $\beta$, the Beta($\alpha, \beta$) distribution is a distribution for a random variable $Y$ on the unit interval. The density is

$$p_Y(y) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} y^{\alpha - 1} (1 - y)^{\beta - 1} \quad \text{for } y \in [0, 1]$$

The parameters are $(\alpha, \beta)$. We write $Y \sim \mathrm{Be}(\alpha, \beta)$. The mean and variance are given by Theorem 5.17.

Theorem 5.17. Let $Y \sim \mathrm{Be}(\alpha, \beta)$. Then

$$E[Y] = \frac{\alpha}{\alpha + \beta} \qquad \mathrm{Var}(Y) = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$

Proof. See Exercise 26.

Figure 5.9 shows some Beta densities. Each panel shows four densities having the same mean. It is evident from the Figure and the definition that the parameter $\alpha$ ($\beta$) controls whether the density rises or falls at the left (right). If both $\alpha > 1$ and $\beta > 1$ then $p(y)$ is unimodal. The $\mathrm{Be}(1, 1)$ is the same as the $\mathrm{U}(0, 1)$ distribution.

The Beta distribution arises as the distribution of order statistics from the $\mathrm{U}(0, 1)$ distribution. Let $x_1, \ldots, x_n \sim$ i.i.d. $\mathrm{U}(0, 1)$. What is the distribution of $x_{(1)}$, the first order statistic? Our strategy is first to find the cdf of $x_{(1)}$, then differentiate to get the pdf.

$$F_{X_{(1)}}(x) = P[X_{(1)} \le x] = 1 - P[\text{all } X_i\text{'s are greater than } x] = 1 - (1 - x)^n$$

Therefore,

$$p_{X_{(1)}}(x) = \frac{d}{dx} F_{X_{(1)}}(x) = n(1 - x)^{n - 1} = \frac{\Gamma(n + 1)}{\Gamma(1)\Gamma(n)} (1 - x)^{n - 1}$$

which is the $\mathrm{Be}(1, n)$ density. For the distribution of the largest order statistic see Exercise 27.
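The claim that the smallest of $n$ independent U(0, 1) draws has the Be(1, $n$) distribution can be checked both analytically and by simulation. In this sketch the sample size `n`, the seed, and the number of replications are illustrative assumptions.

```r
n <- 5
x <- seq(0, 1, by = 0.01)

# cdf and pdf derived in the text, compared with R's Beta(1, n) functions
cdf.gap <- max(abs((1 - (1 - x)^n) - pbeta(x, 1, n)))
pdf.gap <- max(abs(n * (1 - x)^(n - 1) - dbeta(x, 1, n)))

# simulation: minimum of n uniforms, repeated many times
set.seed(1)
mins <- replicate(10000, min(runif(n)))
sim.mean <- mean(mins)       # should be near the Be(1, n) mean
thm.mean <- 1 / (1 + n)      # alpha / (alpha + beta) from Theorem 5.17
```

The analytic gaps should vanish up to rounding, and the simulated mean should be close to $1/(n + 1)$.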

Figure 5.9: Beta densities. a: Beta densities with mean .2; b: Beta densities with mean .5; c: Beta densities with mean .9.

Figure 5.9 was produced by the following R snippet.

    par ( mfrow=c(3,1) )
    y <- seq ( 0, 1, length=100 )
    mean <- c ( .2, .5, .9 )
    alpha <- c ( .3, 1, 3, 10 )
    for ( i in 1:3 ) {
      beta <- ( alpha - mean[i]*alpha ) / mean[i]
      den <- NULL
      for ( j in 1:length(beta) )
        den <- cbind ( den, dbeta(y,alpha[j],beta[j]) )
      matplot ( y, den, type="l", main=letters[i], ylab="p(y)" )
      if ( i == 1 )
        legend ( .6, 8, paste ( "(a,b) = (", round(alpha,2), ",",
                 round(beta,2), ")", sep="" ), lty=1:4 )
      else if ( i == 2 )
        legend ( .1, 4, paste ( "(a,b) = (", round(alpha,2), ",",
                 round(beta,2), ")", sep="" ), lty=1:4 )
      else if ( i == 3 )
        legend ( .1, 10, paste ( "(a,b) = (", round(alpha,2), ",",
                 round(beta,2), ")", sep="" ), lty=1:4 )
    }

The Beta density is closely related to the Gamma density by the following theorem.

Theorem 5.18. Let $X_1 \sim \mathrm{Gam}(\alpha_1, \beta)$; $X_2 \sim \mathrm{Gam}(\alpha_2, \beta)$; and $X_1 \perp X_2$. Then

$$Y \equiv \frac{X_1}{X_1 + X_2} \sim \mathrm{Be}(\alpha_1, \alpha_2)$$

Proof. See Exercise 31.

Note that Theorem 5.18 requires $X_1$ and $X_2$ both to have the same value of $\beta$, but the result doesn't depend on what that value is.
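Theorem 5.18 lends itself to a simulation check: draw independent Gamma variables with a common scale, form the ratio, and compare with the Beta distribution. The sketch below is only illustrative; the shapes `a1` and `a2`, the two scales, the seed, the helper function `ratio`, and the use of the Kolmogorov-Smirnov distance are all assumptions made here.

```r
set.seed(2)
a1 <- 2      # shape of X1 (illustrative)
a2 <- 5      # shape of X2 (illustrative)

# hypothetical helper: simulate X1 / (X1 + X2) for a common scale beta
ratio <- function(beta, nsim = 20000) {
  x1 <- rgamma(nsim, shape = a1, scale = beta)
  x2 <- rgamma(nsim, shape = a2, scale = beta)
  x1 / (x1 + x2)
}

# two very different scales should give the same Be(a1, a2) distribution
y.small <- ratio(beta = 0.5)
y.large <- ratio(beta = 10)

# Kolmogorov-Smirnov distance between each sample and the Be(a1, a2) cdf
ks.small <- unname(ks.test(y.small, "pbeta", a1, a2)$statistic)
ks.large <- unname(ks.test(y.large, "pbeta", a1, a2)$statistic)

# sample means, to compare with a1 / (a1 + a2)
mean.small <- mean(y.small)
mean.large <- mean(y.large)
```

Both distances should be small and both sample means close to $\alpha_1/(\alpha_1 + \alpha_2)$, whichever common scale is used.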

5.7 The Normal and Related Distributions

5.7.1 The Univariate Normal Distribution

The histograms in Figure 1.12 on page 32

• are approximately unimodal,
• are approximately symmetric,
• have different means, and
• have different standard deviations.

Data with these properties is ubiquitous in nature. Statisticians and other scientists often have to model similar looking data. One common probability density for modelling such data is the Normal density, also known as the Gaussian density. The Normal density is also important because of the Central Limit Theorem.

For some constants $\mu \in \mathbb{R}$ and $\sigma > 0$, the Normal density is

$$p(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}. \qquad (5.4)$$

Example 5.5 (Ocean temperatures, continued)
To see a Normal density in more detail, Figure 5.10 reproduces the top right histogram from Figure 1.12 redrawn with the Normal density overlaid, for the values $\mu \approx 8.08$ and $\sigma \approx 0.94$. The vertical axis is drawn on the density scale. There are 112 temperature measurements that go into this histogram; they were recorded between 1949 and 1997; their latitudes are all between 44° and 46°; their longitudes are all between −21° and −19°.

Figure 5.10 was produced by the following R snippet.

    good <- abs ( med.1000$lon - lons[3] ) < 1 &
            abs ( med.1000$lat - lats[1] ) < 1
    temps <- med.1000$temp[good]
    hist ( temps, xlim=c(4,12), breaks=seq(4,12,by=.5),
           freq=F, xlab="temperature", ylab="density", main = "")
    mu <- mean ( temps )
    sig <- sqrt ( var ( temps ) )
    x <- seq ( 4, 12, length=60 )

    lines ( x, dnorm(x,mu,sig), lty=2 )

Figure 5.10: Water temperatures (°C) at 1000m depth, 44-46°N latitude and 19-21°W longitude. The dashed curve is the N(8.08, 0.94) density.

Visually, the Normal density appears to fit the data well. Randomly choosing one of the 112 historical temperature measurements, or making a new measurement near 45°N and 20°W at a randomly chosen time are like drawing a random variable $t$ from the N(8.08, 0.94) distribution.

Look at temperatures between 8.5° and 9.0°C. The N(8.08, 0.94) density says the probability that a randomly drawn temperature $t$ is between 8.5° and 9.0°C is

$$P[t \in (8.5, 9.0]] = \int_{8.5}^{9.0} \frac{1}{\sqrt{2\pi}\, 0.94} e^{-\frac{1}{2}\left(\frac{t - 8.08}{0.94}\right)^2} dt \approx 0.16. \qquad (5.5)$$

The integral in Equation 5.5 is best done on a computer, not by hand. In R it can be done with pnorm(9.0,8.08,.94) - pnorm(8.5,8.08,.94). A fancier way to do it is diff(pnorm(c(8.5,9),8.08,.94)).

• When x is a vector, pnorm(x,mean,sd) returns a vector of pnorm's.

• When x is a vector, diff(x) returns the vector of differences x[2]-x[1], x[3]-x[2], ..., x[n]-x[n-1].

In fact, 19 of the 112 temperatures fell into that bin, and $19/112 \approx 0.17$, so the N(8.08, 0.94) density seems to fit very well.

However, the N(8.08, 0.94) density doesn't fit as well for temperatures between 7.5° and 8.0°C.

$$P[t \in (7.5, 8.0]] = \int_{7.5}^{8.0} \frac{1}{\sqrt{2\pi}\, 0.94} e^{-\frac{1}{2}\left(\frac{t - 8.08}{0.94}\right)^2} dt \approx 0.20.$$

In fact, 15 of the 112 temperatures fell into that bin; and $15/112 \approx 0.13$. Even so, the N(8.08, 0.94) density fits the data set very well.

Theorem 5.19. Let $Y \sim \mathrm{N}(\mu, \sigma)$. Then

$$M_Y(t) = e^{\frac{\sigma^2 t^2}{2} + \mu t}.$$

Proof.

$$\begin{aligned}
M_Y(t) &= \int e^{ty} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2\sigma^2}(y - \mu)^2} \, dy \\
&= \int \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2\sigma^2}\left(y^2 - (2\mu + 2\sigma^2 t)y + \mu^2\right)} \, dy \\
&= e^{-\frac{\mu^2}{2\sigma^2}} \int \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2\sigma^2}\left(y - (\mu + \sigma^2 t)\right)^2 + \frac{(\mu + \sigma^2 t)^2}{2\sigma^2}} \, dy \\
&= e^{\frac{2\mu\sigma^2 t + \sigma^4 t^2}{2\sigma^2}} \\
&= e^{\frac{\sigma^2 t^2}{2} + \mu t}.
\end{aligned}$$

The technique used in the proof of Theorem 5.19 is worth remembering, so let's look at it more abstractly. Apart from multiplicative constants, the first integral in the proof is

$$\int e^{ty} e^{-\frac{1}{2\sigma^2}(y - \mu)^2} \, dy = \int e^{-\frac{1}{2\sigma^2}(y - \mu)^2 + ty} \, dy$$

The exponent is quadratic in $y$ and therefore, for some values of $a$, $b$, $c$, $d$, $e$, and $f$ can be written

$$-\frac{1}{2\sigma^2}(y - \mu)^2 + ty = ay^2 + by + c = -\frac{1}{2}\left(\frac{y - d}{e}\right)^2 + f$$

This last expression has the form of a Normal distribution with mean $d$ and SD $e$. So the integral can be evaluated by putting it in this form and manipulating the constants so it becomes the integral of a pdf and therefore equal to 1. It's a technique that is often useful when working with integrals arising from Normal distributions.

Theorem 5.20. Let $Y \sim \mathrm{N}(\mu, \sigma)$. Then

$$E[Y] = \mu \quad \text{and} \quad \mathrm{Var}(Y) = \sigma^2.$$

Proof. For the mean,

$$E[Y] = M_Y'(0) = (t\sigma^2 + \mu) e^{\frac{\sigma^2 t^2}{2} + \mu t} \Big|_{t=0} = \mu.$$

For the variance,

$$E[Y^2] = M_Y''(0) = \sigma^2 e^{\frac{\sigma^2 t^2}{2} + \mu t} + (t\sigma^2 + \mu)^2 e^{\frac{\sigma^2 t^2}{2} + \mu t} \Big|_{t=0} = \sigma^2 + \mu^2.$$

So,

$$\mathrm{Var}(Y) = E[Y^2] - E[Y]^2 = \sigma^2.$$

The $\mathrm{N}(0, 1)$ distribution is called the standard Normal distribution. As Theorem 5.21 shows, all Normal distributions are just shifted, rescaled versions of the standard Normal distribution. The mean is a location parameter; the standard deviation is a scale parameter. See Section 8.6.

Theorem 5.21.
1. If $X \sim \mathrm{N}(0, 1)$ and $Y = \sigma X + \mu$ then $Y \sim \mathrm{N}(\mu, \sigma)$.
2. If $Y \sim \mathrm{N}(\mu, \sigma)$ and $X = (Y - \mu)/\sigma$ then $X \sim \mathrm{N}(0, 1)$.

Proof.
1. Let $X \sim \mathrm{N}(0, 1)$ and $Y = \sigma X + \mu$. By Theorem 4.8

$$M_Y(t) = e^{\mu t} M_X(\sigma t) = e^{\mu t} e^{\frac{\sigma^2 t^2}{2}}$$

2. Let $Y \sim \mathrm{N}(\mu, \sigma)$ and $X = (Y - \mu)/\sigma$. Then

$$M_X(t) = e^{-\mu t/\sigma} M_Y(t/\sigma) = e^{-\mu t/\sigma} e^{\frac{\sigma^2 (t/\sigma)^2}{2} + \mu t/\sigma} = e^{\frac{t^2}{2}}$$
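The results of this subsection can be illustrated with the ocean temperature values $\mu \approx 8.08$ and $\sigma \approx 0.94$ from Example 5.5. The sketch below recomputes the two bin probabilities, shows that standardizing as in Theorem 5.21 gives the same answer, and checks the mgf of Theorem 5.19 by numerical integration. The value `t <- 0.3` and the integration limits of ten standard deviations either side of the mean are illustrative assumptions.

```r
mu <- 8.08
sigma <- 0.94

# probabilities of the two temperature bins discussed in Example 5.5
p1 <- diff(pnorm(c(8.5, 9.0), mu, sigma))   # text reports about 0.16
p2 <- diff(pnorm(c(7.5, 8.0), mu, sigma))   # text reports about 0.20

# Theorem 5.21: standardize, then use the standard Normal cdf
z <- (c(8.5, 9.0) - mu) / sigma
p1.std <- diff(pnorm(z))

# Theorem 5.19: mgf by numerical integration versus the closed form
t <- 0.3
mgf.num <- integrate(function(y) exp(t * y) * dnorm(y, mu, sigma),
                     mu - 10 * sigma, mu + 10 * sigma)$value
mgf.thm <- exp(sigma^2 * t^2 / 2 + mu * t)
```

The standardized calculation `p1.std` should match `p1`, and the numerical mgf should agree with the closed form to within integration error.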

Section 5.5 introduced the $\chi^2$ distribution, noting that it is a special case of the Gamma distribution. The $\chi^2$ distribution arises in practice as a sum of squares of standard Normals. Here we restate Theorem 5.16, then prove it.

Theorem 5.22. Let $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{N}(0, 1)$. Define $X = \sum Y_i^2$. Then $X \sim \chi^2_n$.

Proof. Start with the case $n = 1$.

$$\begin{aligned}
M_X(t) = E[e^{tY_1^2}] &= \int e^{ty^2} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}} \, dy \\
&= \int \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(1 - 2t)y^2} \, dy \\
&= (1 - 2t)^{-1/2} \int \frac{\sqrt{1 - 2t}}{\sqrt{2\pi}} e^{-\frac{1}{2}(1 - 2t)y^2} \, dy \\
&= (1 - 2t)^{-1/2}
\end{aligned}$$

So $X \sim \mathrm{Gam}(1/2, 2) = \chi^2_1$.

If $n > 1$ then by Corollary 4.10

$$M_X(t) = M_{Y_1^2 + \cdots + Y_n^2}(t) = (1 - 2t)^{-n/2}$$

So $X \sim \mathrm{Gam}(n/2, 2) = \chi^2_n$.

5.7.2 The Multivariate Normal Distribution

Let $\vec{X}$ be an $n$-dimensional random vector with mean $\mu_{\vec{X}}$ and covariance matrix $\Sigma_{\vec{X}}$. We say that $\vec{X}$ has a multivariate Normal distribution if its joint density is

$$p_{\vec{X}}(\vec{x}) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{\frac{1}{2}}} e^{-\frac{1}{2}(\vec{x} - \mu_{\vec{X}})^t \Sigma^{-1} (\vec{x} - \mu_{\vec{X}})} \qquad (5.6)$$

where $|\Sigma|$ refers to the determinant of the matrix $\Sigma$. We write $\vec{X} \sim \mathrm{N}(\mu, \Sigma)$. Comparison of Equations 5.4 (page 316) and 5.6 shows that the latter is a generalization of the former. The multivariate version has the covariance matrix $\Sigma$ in place of the scalar variance $\sigma^2$.

To become more familiar with the multivariate Normal distribution, we begin with the case where the covariance matrix is diagonal:

$$\Sigma = \begin{pmatrix} \sigma_1^2 & 0 & 0 & \cdots \\ 0 & \sigma_2^2 & 0 & \cdots \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \sigma_n^2 \end{pmatrix}$$

In this case the joint density is

$$\begin{aligned}
p_{\vec{X}}(\vec{x}) &= \frac{1}{(2\pi)^{n/2} |\Sigma|^{\frac{1}{2}}} e^{-\frac{1}{2}(\vec{x} - \mu_{\vec{X}})^t \Sigma^{-1} (\vec{x} - \mu_{\vec{X}})} \\
&= \left(\frac{1}{\sqrt{2\pi}}\right)^n \left(\prod_{i=1}^n \frac{1}{\sigma_i}\right) e^{-\frac{1}{2} \sum_{i=1}^n \frac{(x_i - \mu_i)^2}{\sigma_i^2}} \\
&= \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\sigma_i} e^{-\frac{1}{2}\left(\frac{x_i - \mu_i}{\sigma_i}\right)^2},
\end{aligned}$$

the product of $n$ separate one dimensional Normal densities, one for each dimension. Therefore the $X_i$'s are independent and Normally distributed, with $X_i \sim \mathrm{N}(\mu_i, \sigma_i)$. Also see Exercise 32.

When $\sigma_1 = \cdots = \sigma_n = 1$, then $\Sigma$ is the $n$-dimensional identity matrix $I_n$. When, in addition, $\mu_1 = \cdots = \mu_n = 0$, then $\vec{X} \sim \mathrm{N}(0, I_n)$ and $\vec{X}$ is said to have the standard $n$-dimensional Normal distribution.

Note: for two arbitrary random variables $X_1$ and $X_2$, $X_1 \perp X_2$ implies $\mathrm{Cov}(X_1, X_2) = 0$; but $\mathrm{Cov}(X_1, X_2) = 0$ does not imply $X_1 \perp X_2$. However, if $X_1$ and $X_2$ are jointly Normally distributed then the implication is true. I.e. if $(X_1, X_2) \sim \mathrm{N}(\mu, \Sigma)$ and $\mathrm{Cov}(X_1, X_2) = 0$, then $X_1 \perp X_2$. In fact, something stronger is true, as recorded in the next theorem.

Theorem 5.23. Let $\vec{X} = (X_1, \ldots, X_n) \sim \mathrm{N}(\mu, \Sigma)$ where $\Sigma$ has the so-called block-diagonal form

$$\Sigma = \begin{pmatrix} \Sigma_{11} & 0_{12} & \cdots & 0_{1m} \\ 0_{21} & \Sigma_{22} & \cdots & 0_{2m} \\ \vdots & & \ddots & \vdots \\ 0_{m1} & \cdots & 0_{m,m-1} & \Sigma_{mm} \end{pmatrix}$$

where $\Sigma_{ii}$ is an $n_i \times n_i$ matrix, $0_{ij}$ is an $n_i \times n_j$ matrix of 0's and $\sum_1^m n_i = n$. Partition $\vec{X}$ to conform with $\Sigma$ and define $\vec{Y}_i$'s: $\vec{Y}_1 = (X_1, \ldots, X_{n_1})$, $\vec{Y}_2 = (X_{n_1 + 1}, \ldots, X_{n_1 + n_2})$, ..., $\vec{Y}_m = (X_{n_1 + \cdots + n_{m-1} + 1}, \ldots, X_n)$ and $\nu_i$'s: $\nu_1 = (\mu_1, \ldots, \mu_{n_1})$, $\nu_2 = (\mu_{n_1 + 1}, \ldots, \mu_{n_1 + n_2})$, ..., $\nu_m = (\mu_{n_1 + \cdots + n_{m-1} + 1}, \ldots, \mu_n)$. Then

1. The $\vec{Y}_i$'s are independent of each other, and
2. $\vec{Y}_i \sim \mathrm{N}(\nu_i, \Sigma_{ii})$

Proof. The transformation $\vec{X} \to (\vec{Y}_1, \ldots, \vec{Y}_m)$ is just the identity transformation, so

$$\begin{aligned}
p_{\vec{Y}_1, \ldots, \vec{Y}_m}(\vec{y}_1, \ldots, \vec{y}_m) = p_{\vec{X}}(\vec{y}_1, \ldots, \vec{y}_m) &= \frac{1}{(2\pi)^{n/2} |\Sigma|^{\frac{1}{2}}} e^{-\frac{1}{2}(\vec{y} - \mu)^t \Sigma^{-1} (\vec{y} - \mu)} \\
&= \frac{1}{(2\pi)^{n/2} \prod_{i=1}^m |\Sigma_{ii}|^{\frac{1}{2}}} e^{-\frac{1}{2} \sum_{i=1}^m (\vec{y}_i - \nu_i)^t \Sigma_{ii}^{-1} (\vec{y}_i - \nu_i)} \\
&= \prod_{i=1}^m \frac{1}{(2\pi)^{n_i/2} |\Sigma_{ii}|^{\frac{1}{2}}} e^{-\frac{1}{2}(\vec{y}_i - \nu_i)^t \Sigma_{ii}^{-1} (\vec{y}_i - \nu_i)}
\end{aligned}$$

To learn more about the multivariate Normal density, look at the curves on which $p_{\vec{X}}$ is constant; i.e., $\{\vec{x} : p_{\vec{X}}(\vec{x}) = c\}$ for some constant $c$. The density depends on the $x_i$'s through the quadratic form $(\vec{x} - \mu)^t \Sigma^{-1} (\vec{x} - \mu)$, so $p_{\vec{X}}$ is constant where this quadratic form is constant. But when $\Sigma$ is diagonal, $(\vec{x} - \mu)^t \Sigma^{-1} (\vec{x} - \mu) = \sum_1^n (x_i - \mu_i)^2 / \sigma_i^2$ so $p_{\vec{X}}(\vec{x}) = c$ is the equation of an ellipsoid centered at $\mu$ and with eccentricities determined by the ratios $\sigma_i / \sigma_j$.

What does this density look like? It is easiest to answer that question in two dimensions. Figure 5.11 shows three bivariate Normal densities. The left-hand column shows contour plots of the bivariate densities; the right-hand column shows samples from the joint distributions. In all cases, $E[X_1] = E[X_2] = 0$. In the top row, $\sigma_{X_1} = \sigma_{X_2} = 1$; in the second row, $\sigma_{X_1} = 1$; $\sigma_{X_2} = 2$; in the third row, $\sigma_{X_1} = 1/2$; $\sigma_{X_2} = 2$. The standard deviation is a scale parameter, so changing the SD just changes the scale of the random variable. That's what gives the second and third rows more vertical spread than the first, and makes the third row more horizontally squashed than the first and second.

Figure 5.11 was produced with the following R code.

    par ( mfrow=c(3,2) ) # a 3 by 2 array of plots
    x1 <- seq(-5,5,length=60)
    x2 <- seq(-5,5,length=60)
    den.1 <- dnorm ( x1, 0, 1 )
    den.2 <- dnorm ( x2, 0, 1 )
    den.jt <- den.1 %o% den.2
    contour ( x1, x2, den.jt, xlim=c(-5,5), ylim=c(-5,5), main="(a)",

PAGE 336

5.7.NORMAL 323 Figure5.11:BivariateNormaldensity. E [ X 1 ]= E [ X 2 ]=0 a,b : X 1 = X 2 =1 c,d : X 1 =1; X 2 =2 e,f : X 1 =1 = 2; X 2 =2 a,c,e :contoursofthejointdensity. b,d,f :samplesfromthejointdensity.

PAGE 337

5.7.NORMAL 324 xlab=expressionx[1],ylab=expressionx[2] samp.1<-rnorm300,0,1 samp.2<-rnorm300,0,1 plotsamp.1,samp.2,xlim=c-5,5,ylim=c-5,5,main="b", xlab=expressionx[1],ylab=expressionx[2],pch="." den.2<-dnormx2,0,2 den.jt<-den.1%o%den.2 contourx1,x2,den.jt,xlim=c-5,5,ylim=c-5,5,main="c", xlab=expressionx[1],ylab=expressionx[2], samp.2<-rnorm300,0,2 plotsamp.1,samp.2,xlim=c-5,5,ylim=c-5,5,main="d", xlab=expressionx[1],ylab=expressionx[2],pch="." den.1<-dnormx1,0,.5 den.jt<-den.1%o%den.2 contourx1,x2,den.jt,xlim=c-5,5,ylim=c-5,5,main="e", xlab=expressionx[1],ylab=expressionx[2] samp.1<-rnorm300,0,.5 plotsamp.1,samp.2,xlim=c-5,5,ylim=c-5,5,main="f", xlab=expressionx[1],ylab=expressionx[2],pch="." Thecodemakesheavyuseofthefactthat X 1 and X 2 areindependentfora calculatingthejointdensityandbdrawingrandomsamples. den.1%o%den.2 yieldsthe outerproduct of den.1 and den.2 .Itisamatrix whose ij 'thentryis den.1[i]*den.2[j] Nowlet'sseewhathappenswhen isnotdiagonal.Let Y N ~ Y ; ~ Y ,so p ~ Y ~y = 1 n= 2 j ~ Y j 1 2 e )]TJ/F40 5.9776 Tf 7.782 3.258 Td [(1 2 ~y )]TJ/F42 7.9701 Tf 6.587 0 Td [( ~ Y t )]TJ/F40 5.9776 Tf 5.756 0 Td [(1 ~ Y ~y )]TJ/F42 7.9701 Tf 6.586 0 Td [( ~ Y ; andlet ~ X N ;I n ~ X isjustacollectionofindependentN ; 1 randomvariables.Itscurvesofconstantdensityarejust n )]TJ/F15 11.9552 Tf 9.959 0 Td [(1 -dimensionalspherescenteredat


the origin. Define $\vec{Z} = \Sigma_{\vec{Y}}^{1/2} \vec{X} + \mu_{\vec{Y}}$. We will show that $p_{\vec{Z}} = p_{\vec{Y}}$, therefore that $\vec{Z}$ and $\vec{Y}$ have the same distribution, and therefore that any multivariate Normal random vector has the same distribution as a linear transformation of a standard multivariate Normal random vector. To show $p_{\vec{Z}} = p_{\vec{Y}}$ we apply Theorem 4.4. The Jacobian of the transformation from $\vec{X}$ to $\vec{Z}$ is $|\Sigma_{\vec{Y}}|^{1/2}$, the square root of the determinant of $\Sigma_{\vec{Y}}$. Therefore,

\[
\begin{aligned}
p_{\vec{Z}}(\vec{y}) &= p_{\vec{X}}\!\left(\Sigma_{\vec{Y}}^{-1/2}(\vec{y}-\mu_{\vec{Y}})\right) |\Sigma_{\vec{Y}}|^{-1/2} \\
 &= \frac{1}{(\sqrt{2\pi})^{n}} \, e^{-\frac12 \left(\Sigma_{\vec{Y}}^{-1/2}(\vec{y}-\mu_{\vec{Y}})\right)^t \left(\Sigma_{\vec{Y}}^{-1/2}(\vec{y}-\mu_{\vec{Y}})\right)} \, |\Sigma_{\vec{Y}}|^{-1/2} \\
 &= \frac{1}{(2\pi)^{n/2}|\Sigma_{\vec{Y}}|^{1/2}} \, e^{-\frac12 (\vec{y}-\mu_{\vec{Y}})^t \Sigma_{\vec{Y}}^{-1} (\vec{y}-\mu_{\vec{Y}})} \\
 &= p_{\vec{Y}}(\vec{y})
\end{aligned}
\]

The preceding result says that any multivariate Normal random variable, $\vec{Y}$ in our notation above, has the same distribution as a linear transformation of a standard Normal random variable.

To see what multivariate Normal densities look like it is easiest to look at 2 dimensions. Figure 5.12 shows three bivariate Normal densities. The left-hand column shows contour plots of the bivariate densities; the right-hand column shows samples from the joint distributions. In all cases, $E[X_1] = E[X_2] = 0$ and $\sigma_1 = \sigma_2 = 1$. In the top row, $\sigma_{1,2} = 0$; in the second row, $\sigma_{1,2} = .5$; in the third row, $\sigma_{1,2} = -.8$.

Figure 5.12 was produced with the following R code.

    par ( mfrow=c(3,2) )   # a 3 by 2 array of plots
    npts <- 60
    sampsize <- 300
    x1 <- seq(-5,5,length=npts)
    x2 <- seq(-5,5,length=npts)

    # one 2 by 2 covariance matrix per row of plots
    Sigma <- array ( NA, c(3,2,2) )
    Sigma[1,,] <- c(1,0,0,1)
    Sigma[2,,] <- c(1,.5,.5,1)
    Sigma[3,,] <- c(1,-.8,-.8,1)

    den.jt <- matrix ( NA, npts, npts )

    for ( i in 1:3 ) {
      Sig <- Sigma[i,,]
      Siginv <- solve(Sig)   # matrix inverse

      # evaluate the bivariate Normal density on the grid
      for ( j in 1:npts )
        for ( k in 1:npts ) {
          x <- c ( x1[j], x2[k] )
          den.jt[j,k] <- ( 1 / ( 2*pi*sqrt(det(Sig)) ) ) *
                         exp ( -.5 * t(x) %*% Siginv %*% x )
        }
      contour ( x1, x2, den.jt, xlim=c(-5,5), ylim=c(-5,5),
                drawlabels=FALSE,
                xlab=expression(x[1]),
                ylab=expression(x[2]), main=letters[2*i-1] )

      # sample by transforming standard Normals with a square root of Sig;
      # t(chol(Sig)) is a matrix L with L %*% t(L) equal to Sig
      samp <- matrix ( rnorm ( 2*sampsize ), 2, sampsize )
      samp <- t(chol(Sig)) %*% samp
      plot ( samp[1,], samp[2,], pch=".",
             xlim=c(-5,5), ylim=c(-5,5),
             xlab=expression(x[1]),
             ylab=expression(x[2]), main=letters[2*i] )
    }

Figure 5.12: Bivariate Normal density. $E[X_1] = E[X_2] = 0$; $\sigma_1 = \sigma_2 = 1$. (a,b): $\sigma_{1,2} = 0$. (c,d): $\sigma_{1,2} = .5$. (e,f): $\sigma_{1,2} = -.8$. (a,c,e): contours of the joint density. (b,d,f): samples from the joint density.

We conclude this section with some theorems about Normal random variables that will prove useful later.

Theorem 5.24. Let $\vec{X} \sim \mathrm{N}(\mu, \Sigma)$ be an $n$-dimensional Normal random variable; let $A$ be a full rank $n$ by $n$ matrix; and let $\vec{Y} = A\vec{X}$. Then $\vec{Y} \sim \mathrm{N}(A\mu, A\Sigma A^t)$.
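Before turning to the proof, the statement of Theorem 5.24 can be made concrete with a small simulation. The sketch below is only an illustration: the values of mu, Sigma and A are arbitrary choices, not taken from the text, and the Cholesky factor is used as one convenient square root of Sigma. It compares the sample mean and covariance of $A\vec{X}$ with $A\mu$ and $A\Sigma A^t$.

```r
# Simulation sketch for Theorem 5.24; mu, Sigma and A are illustrative choices.
set.seed(1)
n.sim <- 100000
mu    <- c(1, -2)
Sigma <- matrix(c(1, .5, .5, 2), 2, 2)
A     <- matrix(c(2, 1, 0, 1), 2, 2)       # full rank: det(A) = 2

# Draw X ~ N(mu, Sigma) as L Z + mu, where L L^t = Sigma and Z ~ N(0, I).
L <- t(chol(Sigma))                         # lower-triangular factor of Sigma
Z <- matrix(rnorm(2 * n.sim), 2, n.sim)     # each column is one draw of Z
X <- L %*% Z + mu                           # mu is recycled down each column
Y <- A %*% X

emp.mean <- rowMeans(Y)                     # sample mean of Y
emp.cov  <- cov(t(Y))                       # sample covariance of Y
thy.mean <- as.vector(A %*% mu)             # A mu
thy.cov  <- A %*% Sigma %*% t(A)            # A Sigma A^t

print(rbind(empirical = emp.mean, theory = thy.mean))
print(emp.cov)
print(thy.cov)
```

With this many draws the empirical and theoretical quantities should agree to about two decimal places; agreement in a simulation is of course a sanity check, not a proof.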


Proof. By Theorem 4.4 (pg. 268),

\[
\begin{aligned}
p_{\vec{Y}}(\vec{y}) &= p_{\vec{X}}(A^{-1}\vec{y}) \, |A^{-1}| \\
 &= \frac{1}{(2\pi)^{n/2}|A||\Sigma|^{1/2}} \, e^{-\frac12 (A^{-1}\vec{y}-\mu)^t \Sigma^{-1} (A^{-1}\vec{y}-\mu)} \\
 &= \frac{1}{(2\pi)^{n/2}|A||\Sigma|^{1/2}} \, e^{-\frac12 \left(A^{-1}(\vec{y}-A\mu)\right)^t \Sigma^{-1} \left(A^{-1}(\vec{y}-A\mu)\right)} \\
 &= \frac{1}{(2\pi)^{n/2}|A||\Sigma|^{1/2}} \, e^{-\frac12 (\vec{y}-A\mu)^t (A^{-1})^t \Sigma^{-1} A^{-1} (\vec{y}-A\mu)} \\
 &= \frac{1}{(2\pi)^{n/2}|A\Sigma A^t|^{1/2}} \, e^{-\frac12 (\vec{y}-A\mu)^t (A\Sigma A^t)^{-1} (\vec{y}-A\mu)}
\end{aligned}
\]

which we recognize as the $\mathrm{N}(A\mu, A\Sigma A^t)$ density.

Corollary 5.25. Let $\vec{X} \sim \mathrm{N}(\mu, \Sigma)$ be an $n$-dimensional Normal random variable; let $A$ be a full rank $n$ by $n$ matrix; let $b$ be a vector of length $n$; and let $\vec{Y} = A\vec{X} + b$. Then $\vec{Y} \sim \mathrm{N}(A\mu + b, A\Sigma A^t)$.

Proof. See Exercise 33.

Corollary 5.26. Let $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$. Define $S^2 \equiv \sum_{i=1}^n (X_i - \bar{X})^2$. Then $\bar{X} \perp S^2$.

Proof. Define the random vector $\vec{Y} = (Y_1, \ldots, Y_n)^t$ by

\[
\begin{aligned}
Y_1 &= X_1 - \bar{X} \\
Y_2 &= X_2 - \bar{X} \\
 &\;\;\vdots \\
Y_{n-1} &= X_{n-1} - \bar{X} \\
Y_n &= \bar{X}
\end{aligned}
\]

The proof follows these steps.

1. $S^2$ is a function only of $(Y_1, \ldots, Y_{n-1})^t$; i.e. not a function of $Y_n$.
2. $(Y_1, \ldots, Y_{n-1})^t \perp Y_n$.
3. Therefore $S^2 \perp Y_n$.


1. $\sum_{i=1}^n (X_i - \bar{X}) = 0$. Therefore, $(X_n - \bar{X}) = -\sum_{i=1}^{n-1} (X_i - \bar{X})$. And therefore

\[
S^2 = \sum_{i=1}^{n-1} (X_i - \bar{X})^2 + \left( \sum_{i=1}^{n-1} (X_i - \bar{X}) \right)^2 = \sum_{i=1}^{n-1} Y_i^2 + \left( \sum_{i=1}^{n-1} Y_i \right)^2
\]

is a function of $(Y_1, \ldots, Y_{n-1})^t$.

2.
\[
\vec{Y} =
\begin{pmatrix}
1-\frac1n & -\frac1n & -\frac1n & \cdots & -\frac1n \\
-\frac1n & 1-\frac1n & -\frac1n & \cdots & -\frac1n \\
\vdots & & \ddots & & \vdots \\
-\frac1n & \cdots & -\frac1n & 1-\frac1n & -\frac1n \\
\frac1n & \frac1n & \cdots & \frac1n & \frac1n
\end{pmatrix}
\vec{X} \equiv A\vec{X}
\]

where the matrix $A$ is defined by the preceding equation, so $\vec{Y} \sim \mathrm{N}(A\vec{\mu}, \sigma^2 AA^t)$, where $\vec{\mu} = (\mu, \ldots, \mu)^t$ is the mean vector of $\vec{X}$. The first $n-1$ rows of $A$ are each orthogonal to the last row. Therefore

\[
AA^t = \begin{pmatrix} \Sigma_{11} & \vec{0} \\ \vec{0}^t & 1/n \end{pmatrix}
\]

where $\Sigma_{11}$ has dimension $(n-1) \times (n-1)$ and $\vec{0}$ is the $(n-1)$-dimensional vector of 0's. Thus, by Theorem 5.23, $(Y_1, \ldots, Y_{n-1})^t \perp Y_n$.

3. Follows immediately from 1 and 2.

5.8 The t and F Distributions

5.8.1 The t distribution

The $t$ distribution arises when making inference about the mean of a Normal distribution.

Let $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$ where both $\mu$ and $\sigma$ are unknown, and suppose our goal is to estimate $\mu$. $\hat{\mu} = \bar{X}$ is a sensible estimator. Its sampling distribution is $\bar{X} \sim \mathrm{N}(\mu, \sigma/\sqrt{n})$ or, equivalently,

\[
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim \mathrm{N}(0, 1)
\]


We would like to use this equation to tell us how accurately we can estimate $\mu$. Apparently we can estimate $\mu$ to within about $\pm 2\sigma/\sqrt{n}$ most of the time. But that's not an immediately useful statement because we don't know $\sigma$. So we estimate $\sigma$ by $\hat{\sigma} = \left( n^{-1} \sum (X_i - \bar{X})^2 \right)^{1/2}$ and say

\[
\frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} \sim \mathrm{N}(0, 1),
\]

approximately. This section derives the exact distribution of $(\bar{X} - \mu)/(\hat{\sigma}/\sqrt{n})$ and assesses how good the Normal approximation is. We already know from Corollary 5.26 that $\bar{X} \perp \hat{\sigma}$. Theorem 5.28 gives the distribution of $S^2 = n\hat{\sigma}^2 = \sum (X_i - \bar{X})^2$. First we need a lemma.

Lemma 5.27. Let $V = V_1 + V_2$ and $W = W_1 + W_2$ where $V_1 \perp V_2$ and $W_1 \perp W_2$. If $V$ and $W$ have the same distribution, and if $V_1$ and $W_1$ have the same distribution, then $V_2$ and $W_2$ have the same distribution.

Proof. Using moment generating functions,

\[
M_{V_2}(t) = M_V(t)/M_{V_1}(t) = M_W(t)/M_{W_1}(t) = M_{W_2}(t)
\]

Theorem 5.28. Let $X_1, \ldots, X_n \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$. Define $S^2 = \sum_{i=1}^n (X_i - \bar{X})^2$. Then

\[
\frac{S^2}{\sigma^2} \sim \chi^2_{n-1}.
\]

Proof. Let

\[
V = \sum_{i=1}^n \left( \frac{X_i - \mu}{\sigma} \right)^2.
\]


Then $V \sim \chi^2_n$ and

\[
\begin{aligned}
V &= \sum_{i=1}^n \left( \frac{(X_i - \bar{X}) + (\bar{X} - \mu)}{\sigma} \right)^2 \\
 &= \sum_{i=1}^n \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + n \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 + 2 \, \frac{\bar{X} - \mu}{\sigma^2} \sum_{i=1}^n (X_i - \bar{X}) \\
 &= \sum_{i=1}^n \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right)^2 \\
 &\equiv \frac{S^2}{\sigma^2} + V_2
\end{aligned}
\]

where $S^2/\sigma^2 \perp V_2$ and $V_2 \sim \chi^2_1$. But also,

\[
V = \sum_{i=1}^{n-1} \left( \frac{X_i - \mu}{\sigma} \right)^2 + \left( \frac{X_n - \mu}{\sigma} \right)^2 \equiv W_1 + W_2
\]

where $W_1 \perp W_2$, $W_1 \sim \chi^2_{n-1}$ and $W_2 \sim \chi^2_1$. Now the conclusion follows by Lemma 5.27.

Define

\[
T \equiv \sqrt{\frac{n-1}{n}} \left( \frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} \right) = \frac{\sqrt{n}(\bar{X} - \mu)/\sigma}{\sqrt{S^2/((n-1)\sigma^2)}}.
\]

Then by Corollary 5.26 and Theorem 5.28, $T$ has the distribution of $U/\sqrt{V/(n-1)}$ where $U \sim \mathrm{N}(0,1)$, $V \sim \chi^2_{n-1}$, and $U \perp V$. This distribution is called the $t$ distribution with $n-1$ degrees of freedom. We write $T \sim t_{n-1}$. Theorem 5.29 derives its density.

Theorem 5.29. Let $U \sim \mathrm{N}(0,1)$, $V \sim \chi^2_p$, and $U \perp V$. Then $T \equiv U/\sqrt{V/p}$ has density

\[
p_T(t) = \frac{\Gamma\!\left(\frac{p+1}{2}\right) p^{p/2}}{\Gamma\!\left(\frac{p}{2}\right) \sqrt{\pi}} \left( t^2 + p \right)^{-\frac{p+1}{2}} = \frac{\Gamma\!\left(\frac{p+1}{2}\right)}{\Gamma\!\left(\frac{p}{2}\right) \sqrt{p\pi}} \left( 1 + \frac{t^2}{p} \right)^{-\frac{p+1}{2}}.
\]

Proof. Define

\[
T = \frac{U}{\sqrt{V/p}} \quad \text{and} \quad Y = V
\]


We make the transformation $(U, V) \to (T, Y)$, find the joint density of $(T, Y)$, and then the marginal density of $T$. The inverse transformation is

\[
U = \frac{T Y^{1/2}}{\sqrt{p}} \quad \text{and} \quad V = Y
\]

The Jacobian is

\[
\begin{vmatrix} \frac{dU}{dT} & \frac{dU}{dY} \\[2pt] \frac{dV}{dT} & \frac{dV}{dY} \end{vmatrix}
= \begin{vmatrix} \frac{Y^{1/2}}{\sqrt{p}} & \frac{T Y^{-1/2}}{2\sqrt{p}} \\[2pt] 0 & 1 \end{vmatrix}
= \frac{Y^{1/2}}{\sqrt{p}}
\]

The joint density of $(U, V)$ is

\[
p_{U,V}(u, v) = \frac{1}{\sqrt{2\pi}} e^{-\frac{u^2}{2}} \, \frac{1}{\Gamma\!\left(\frac{p}{2}\right) 2^{p/2}} \, v^{\frac{p}{2}-1} e^{-\frac{v}{2}}.
\]

Therefore the joint density of $(T, Y)$ is

\[
p_{T,Y}(t, y) = \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2 y}{2p}} \, \frac{1}{\Gamma\!\left(\frac{p}{2}\right) 2^{p/2}} \, y^{\frac{p}{2}-1} e^{-\frac{y}{2}} \, \frac{y^{1/2}}{\sqrt{p}}
\]

and the marginal density of $T$ is

\[
\begin{aligned}
p_T(t) &= \int p_{T,Y}(t, y) \, dy \\
 &= \frac{1}{\sqrt{2\pi} \, \Gamma\!\left(\frac{p}{2}\right) 2^{p/2} \sqrt{p}} \int_0^\infty y^{\frac{p+1}{2}-1} e^{-\frac{y}{2}\left(\frac{t^2}{p}+1\right)} \, dy \\
 &= \frac{1}{\sqrt{\pi} \, \Gamma\!\left(\frac{p}{2}\right) 2^{\frac{p+1}{2}} \sqrt{p}} \, \Gamma\!\left(\tfrac{p+1}{2}\right) \left( \frac{2p}{t^2+p} \right)^{\frac{p+1}{2}}
 \int_0^\infty \frac{1}{\Gamma\!\left(\frac{p+1}{2}\right) \left( \frac{2p}{t^2+p} \right)^{\frac{p+1}{2}}} \, y^{\frac{p+1}{2}-1} e^{-\frac{y}{2p/(t^2+p)}} \, dy \\
 &= \frac{\Gamma\!\left(\frac{p+1}{2}\right) p^{p/2}}{\Gamma\!\left(\frac{p}{2}\right) \sqrt{\pi}} \left( t^2 + p \right)^{-\frac{p+1}{2}}
 = \frac{\Gamma\!\left(\frac{p+1}{2}\right)}{\Gamma\!\left(\frac{p}{2}\right) \sqrt{p\pi}} \left( 1 + \frac{t^2}{p} \right)^{-\frac{p+1}{2}}.
\end{aligned}
\]

The integral in the third line equals 1 because its integrand is a Gamma density.
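The density just derived can be checked against R's built-in t density. The following is a minimal sketch, not part of the original text: the function name t_density is made up for illustration, and the formula is coded on the log scale (with lgamma) only to avoid overflow for large p.

```r
# Density of Theorem 5.29, evaluated on the log scale for numerical stability.
# t_density is a hypothetical helper name used only in this sketch.
t_density <- function(t, p) {
  log.const <- lgamma((p + 1) / 2) - lgamma(p / 2) - 0.5 * log(p * pi)
  exp(log.const - ((p + 1) / 2) * log1p(t^2 / p))
}

x <- seq(-5, 5, length = 11)
for (p in c(1, 4, 16, 64)) {
  # largest absolute discrepancy from R's dt() on the grid x
  print(max(abs(t_density(x, p) - dt(x, p))))
}

# A density should integrate to 1.
print(integrate(t_density, -Inf, Inf, p = 4)$value)
```

The printed discrepancies should be at the level of floating point rounding, and the integral should be close to 1.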


Figure 5.13 shows the $t$ density for 1, 4, 16, and 64 degrees of freedom, and the $\mathrm{N}(0,1)$ density. The two points to note are

1. The $t$ densities are unimodal and symmetric about 0, but have less mass in the middle and more mass in the tails than the $\mathrm{N}(0,1)$ density.
2. In the limit, as $p \to \infty$, the $t_p$ density appears to approach the $\mathrm{N}(0,1)$ density. (The appearance is correct. See Exercise 34.)

Figure 5.13 was produced with the following snippet.

    x <- seq(-5,5,length=100)
    # one column per curve: t with 1, 4, 16, 64 df, then the Normal
    dens <- cbind ( dt(x,1), dt(x,4), dt(x,16), dt(x,64),
                    dnorm(x) )
    matplot ( x, dens, type="l", ylab="density", xlab="t",
              lty=c(2:5,1), col=1 )
    legend ( x=-5, y=.4, lty=c(2:5,1),
             legend=c ( paste("df = ",c(1,4,16,64)),
                        "Normal" ) )

At the beginning of Section 5.8.1 we said the quantity $\sqrt{n}(\bar{X} - \mu)/\hat{\sigma}$ had a $\mathrm{N}(0,1)$ distribution, approximately. Theorem 5.29 derives the density of the related quantity $\sqrt{n-1}\,(\bar{X} - \mu)/\hat{\sigma}$ which has a $t_{n-1}$ distribution, exactly. Figure 5.13 shows how similar those distributions are. The $t$ distribution has slightly more spread than the $\mathrm{N}(0,1)$ distribution, reflecting the fact that $\sigma$ has to be estimated. But when $n$ is large, i.e. when $\sigma$ is well estimated, then the two distributions are nearly identical.

If $T \sim t_p$, then

\[
E[T] = \int_{-\infty}^{\infty} t \, \frac{\Gamma\!\left(\frac{p+1}{2}\right)}{\Gamma\!\left(\frac{p}{2}\right) \sqrt{p\pi}} \left( 1 + \frac{t^2}{p} \right)^{-\frac{p+1}{2}} dt \qquad (5.7)
\]

In the limit as $t \to \infty$, the integrand behaves like $t^{-p}$; hence 5.7 is integrable if and only if $p > 1$. Thus the $t_1$ distribution, also known as the Cauchy distribution, has no mean. When $p > 1$, $E[T] = 0$, by symmetry. By a similar argument, the $t_p$ distribution has a variance if and only if $p > 2$. When $p > 2$, then $\mathrm{Var}(T) = p/(p-2)$. In general, $T$ has a $k$-th moment ($E[T^k] < \infty$) if and only if $p > k$.


Figure 5.13: $t$ densities for four degrees of freedom and the $\mathrm{N}(0,1)$ density
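The results of this section lend themselves to a simulation check. The sketch below uses illustrative values of n, mu and sigma that are not from the text. It draws many Normal samples, looks at $S^2/\sigma^2$ (Theorem 5.28), at the dependence between $\bar{X}$ and $S^2$ (Corollary 5.26), and at the quantiles of the statistic $T$, and then checks $\mathrm{Var}(T) = p/(p-2)$ by numerical integration.

```r
set.seed(2)
n     <- 5        # sample size, so the statistic has n - 1 = 4 degrees of freedom
mu    <- 10       # illustrative values, not from the text
sigma <- 3
n.sim <- 20000

X     <- matrix(rnorm(n * n.sim, mu, sigma), n.sim, n)  # one sample per row
xbar  <- rowMeans(X)
S2    <- rowSums((X - xbar)^2)                          # sum of squared deviations
tstat <- sqrt(n) * (xbar - mu) / sqrt(S2 / (n - 1))

# Theorem 5.28: S2 / sigma^2 ~ chi-square(n - 1), which has mean n - 1.
print(mean(S2 / sigma^2))

# Corollary 5.26: xbar and S2 are independent, so their correlation is near 0.
print(cor(xbar, S2))

# Upper quantiles of the simulated statistic next to t and Normal quantiles.
probs <- c(.9, .95, .975)
print(rbind(simulated = quantile(tstat, probs),
            t         = qt(probs, n - 1),
            normal    = qnorm(probs)))

# Var(T) = p / (p - 2) when p > 2, checked by numerical integration.
p <- n - 1
v <- integrate(function(t) t^2 * dt(t, p), -Inf, Inf)$value
print(c(numerical = v, formula = p / (p - 2)))
```

With only 4 degrees of freedom the simulated quantiles should sit visibly closer to the t quantiles than to the Normal ones; rerunning with a larger n shows the two reference rows converging. A near-zero correlation is consistent with independence but does not by itself establish it.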


5.8.2 The F distribution

5.9 Exercises

1. Prove Theorem 5.4 by moment generating functions.

2. Refer to Theorem 5.8.
   (a) What was the point of the next to last step?
   (b) Justify the last step.

3. Assume that all players on a basketball team are 70% free throw shooters and that free throws are independent of each other.
   (a) The team takes 40 free throws in a game. Write down a formula for the probability that they make exactly 37 of them. You do not need to evaluate the formula.
   (b) The team takes 20 free throws the next game. Write down a formula for the probability that they make exactly 9 of them.
   (c) Write down a formula for the probability that the team makes exactly 37 free throws in the first game and exactly 9 in the second game. That is, write a formula for the probability that they accomplish both feats.

4. Write down the distribution you would use to model each of the following random variables. Be as specific as you can. I.e., instead of answering "Poisson distribution", answer "Poi(3)" or instead of answering "Binomial", answer "Bin($n$, $p$) where $n = 13$ but $p$ is unknown."
   (a) The temperature measured at a randomly selected point on the surface of Mars.
   (b) The number of car accidents in January at the corner of Broad Street and Main Street.
   (c) Out of 20 people in a post office, the number who, when exposed to anthrax spores, actually develop anthrax.
   (d) Out of 10,000 people given a smallpox vaccine, the number who develop smallpox.
   (e) The amount of Mercury in a fish caught in Lake Ontario.


5. A student types dpois(3,1.5) into R. R responds with 0.1255107.
   (a) Write down in words what the student just calculated.
   (b) Write down a mathematical formula for what the student just calculated.

6. Name the distribution. Your answers should be of the form Poi(7) or N(3, 22), etc. Use numbers when parameters are known, symbols when they're not.
   You spend the evening at the roulette table in a casino. You bet on red 100 times. Each time the chance of winning is 18/38. If you win, you win \$1; if you lose, you lose \$1. The average amount of time between bets is 90 seconds; the standard deviation is 5 seconds.
   (a) the number of times you win
   (b) the number of times you lose
   (c) the number of bets until your third win
   (d) the number of bets until your thirtieth loss
   (e) the amount of time to play your first 40 bets
   (f) the additional amount of time to play your next 60 bets
   (g) the total amount of time to play your 100 bets
   (h) your net profit at the end of the evening
   (i) the amount of time until a stranger wearing a red carnation sits down next to you
   (j) the number of times you are accidentally jostled by the person standing behind you

7. A golfer plays the same golf course daily for a period of many years. You may assume that he does not get better or worse, that all holes are equally difficult and that the results on one hole do not influence the results on any other hole. On any one hole, he has probabilities .05, .5, and .45 of being under par, exactly par, and over par, respectively. Write down what distribution best models each of the following random variables. Be as specific as you can. I.e., instead of answering "Poisson distribution" answer "Poi(3)" or "Poi($\lambda$) where $\lambda$ is unknown." For some parts the correct answer might be "I don't know."
   (a) X, the number of holes over par on 17 September, 2002
   (b) W, the number of holes over par in September, 2002


   (c) Y, the number of rounds over par in September, 2002
   (d) Z, the number of times he is hit by lightning in this decade
   (e) H, the number of holes-in-one this decade
   (f) T, the time, in years, until his next hole-in-one

8. During a CAT scan, a source (your brain) emits photons which are counted by a detector (the machine). The detector is mounted at the end of a long tube, so only photons that head straight down the tube are detected. In other words, though the source emits photons in all directions, the only ones detected are those that are emitted within the small range of angles that lead down the tube to the detector.
   Let X be the number of photons emitted by the source in 5 seconds. Suppose the detector captures only 1% of the photons emitted by the source. Let Y be the number of photons captured by the detector in those same 5 seconds.
   (a) What is a good model for the distribution of X?
   (b) What is the conditional distribution of Y given X?
   (c) What is the marginal distribution of Y?
   Try to answer these questions from first principles, without doing any calculations.

9. (a) Prove Theorem 5.11 using moment generating functions.
   (b) Prove Theorem 5.12 using moment generating functions.

10. (a) Prove Theorem 5.14 by finding $E[Y^2]$ using the trick that was used to prove Theorem 5.12.
    (b) Prove Theorem 5.14 by finding $E[Y^2]$ using moment generating functions.

11. Case Study 4.2.3 in Larsen and Marx [add reference] claims that the number of fumbles per team in a football game is well modelled by a Poisson(2.55) distribution. For this quiz, assume that claim is correct.
    (a) What is the expected number of fumbles per team in a football game?
    (b) What is the expected total number of fumbles by both teams?
    (c) What is a good model for the total number of fumbles by both teams?


    (d) In a game played in 2002, Duke fumbled 3 times and Navy fumbled 4 times. Write a formula (Don't evaluate it.) for the probability that Duke will fumble exactly 3 times in next week's game.
    (e) Write a formula (Don't evaluate it.) for the probability that Duke will fumble exactly three times given that they fumble at least once.

12. Clemson University, trying to maintain its superiority over Duke in ACC football, recently added a new practice field by reclaiming a few acres of swampland surrounding the campus. However, the coaches and players refused to practice there in the evenings because of the overwhelming number of mosquitos.
    To solve the problem the Athletic Department installed 10 bug zappers around the field. Each bug zapper, each hour, zaps a random number of mosquitos that has a Poisson distribution.
    (a) What is the exact distribution of the number of mosquitos zapped by 10 zappers in an hour? What are its expected value and variance?
    (b) What is a good approximation to the distribution of the number of mosquitos zapped by 10 zappers during the course of a 4 hour practice?
    (c) Starting from your answer to the previous part, find a random variable relevant to this problem that has approximately a N(0,1) distribution.

13. Bob is a high school senior applying to Duke and wants something that will make his application stand out from all the others. He figures his best chance to impress the admissions office is to enter the Guinness Book of World Records for the longest amount of time spent continuously brushing one's teeth with an electric toothbrush. (Time out for changing batteries is permissible.) Batteries for Bob's toothbrush last an average of 100 minutes each, with a variance of 100. To prepare for his assault on the world record, Bob lays in a supply of 100 batteries.
    The television cameras arrive along with representatives of the Guinness company and the American Dental Association and Bob begins the quest that he hopes will be the defining moment of his young life. Unfortunately for Bob his quest ends in humiliation as his batteries run out before he can reach the record which currently stands at 10,200 minutes.
    Justice is well served however because, although Bob did take AP Statistics in high school, he was not a very good student. Had he been a good statistics


    student he would have calculated in advance the chance that his batteries would run out in less than 10,200 minutes.
    Calculate, approximately, that chance for Bob.

14. An article on statistical fraud detection (Bolton and Hand [1992]), when talking about records in a database, says:
    "One of the difficulties with fraud detection is that typically there are many legitimate records for each fraudulent one. A detection method which correctly identifies 99% of the legitimate records as legitimate and 99% of the fraudulent records as fraudulent might be regarded as a highly effective system. However, if only 1 in 1000 records is fraudulent, then, on average, in every 100 that the system flags as fraudulent, only about 9 will in fact be so."
    QUESTION: Can you justify the "about 9"?

15. [credit to FPP here, or change the question.] In 1988 men averaged around 500 on the math SAT, the SD was around 100 and the histogram followed the normal curve.
    (a) Estimate the percentage of men getting over 600 on this test in 1988.
    (b) One of the men who took the test in 1988 will be picked at random, and you have to guess his test score. You will be given a dollar if you guess it right to within 50 points.
        i. What should you guess?
        ii. What is your chance of winning?

16. Multiple choice.
    (a) $X \sim \mathrm{Poi}(\lambda)$. $\Pr[X \le 7] =$
        i. $\sum_{x=-\infty}^{7} e^{-\lambda} \lambda^x / x!$
        ii. $\sum_{x=0}^{7} e^{-\lambda} \lambda^x / x!$
        iii. $\sum_{\lambda=0}^{7} e^{-\lambda} \lambda^x / x!$
    (b) $X$ and $Y$ are distributed uniformly on the unit square. $\Pr[X \le .5 \mid Y \le .25] =$
        i. .5
        ii. .25
        iii. can't tell from the information given.


    (c) $X \sim \mathrm{Normal}(\mu, \sigma^2)$. $\Pr[X > \mu + \sigma]$
        i. is more than .5
        ii. is less than .5
        iii. can't tell from the information given.
    (d) $X_1, \ldots, X_{100} \sim \mathrm{N}(0, 1)$. $\bar{X} \equiv (X_1 + \cdots + X_{100})/100$. $Y \equiv X_1 + \cdots + X_{100}$. Calculate
        i. $\Pr[-.2 \le \bar{X} \le .2]$
        ii. $\Pr[-.2 \le X_i \le .2]$
        iii. $\Pr[-.2 \le Y \le .2]$
        iv. $\Pr[-2 \le \bar{X} \le 2]$
        v. $\Pr[-2 \le X_i \le 2]$
        vi. $\Pr[-2 \le Y \le 2]$
        vii. $\Pr[-20 \le \bar{X} \le 20]$
        viii. $\Pr[-20 \le X_i \le 20]$
        ix. $\Pr[-20 \le Y \le 20]$
    (e) $X \sim \mathrm{Bin}(100, \theta)$. $\sum_{\theta=0}^{100} f(x \mid \theta) =$
        i. 1
        ii. the question doesn't make sense
        iii. can't tell from the information given.
    (f) $X$ and $Y$ have joint density $f(x, y)$ on the unit square. $f(x) =$
        i. $\int_0^1 f(x, y) \, dx$
        ii. $\int_0^1 f(x, y) \, dy$
        iii. $\int_0^x f(x, y) \, dy$
    (g) $X_1, \ldots, X_n \sim \mathrm{Gamma}(r, \lambda)$ and are mutually independent. $f(x_1, \ldots, x_n) =$
        i. $[\lambda^r/(r-1)!] \left( \prod x_i \right)^{r-1} e^{-\lambda \sum x_i}$
        ii. $[\lambda^{nr}/((r-1)!)^n] \left( \prod x_i \right)^{r-1} e^{-\lambda \prod x_i}$
        iii. $[\lambda^{nr}/((r-1)!)^n] \left( \prod x_i \right)^{r-1} e^{-\lambda \sum x_i}$

17. In Figure 5.2, the plots look increasingly Normal as we go down each column. Why? Hint: a well-known theorem is involved.

18. Prove Theorem 5.7.


19. Prove a version of Equation 5.1 on page 291. Let $k = 2$. Start from the joint pmf of $Y_1$ and $Y_2$ (Use Theorem 5.7.), derive the marginal pmf of $Y_1$, and identify it.

20. (Rongelap Island, Poisson distribution)

21. (seed rain, Poisson distribution)

22. (a) Let $Y \sim \mathrm{U}(1, n)$ where the parameter $n$ is an unknown positive integer. Suppose we observe $Y = 6$. Find the m.l.e. $\hat{n}$. Hint: Equation 5.2 defines the pmf for $y \in \{1, 2, \ldots, n\}$. What is $p(y)$ when $y \notin \{1, 2, \ldots, n\}$?
    (b) In World War II, when German tanks came from the factory they had serial numbers labelled consecutively from 1. I.e., the numbers were 1, 2, .... The Allies wanted to estimate $T$, the total number of German tanks and had, as data, the serial numbers of the tanks they had captured. Assuming that tanks were captured independently of each other and that all tanks were equally likely to be captured, find the m.l.e. $\hat{T}$.

23. Let $Y$ be a continuous random variable, $Y \sim \mathrm{U}(a, b)$.
    (a) Find $E[Y]$.
    (b) Find $\mathrm{Var}(Y)$.
    (c) Find $M_Y(t)$.

24. (a) Is there a discrete distribution that is uniform on the positive integers? Why or why not? If there is such a distribution then we might call it $\mathrm{U}(1, \infty)$.
    (b) Is there a continuous distribution that is uniform on the real line? Why or why not? If there is, then we might call it $\mathrm{U}(-\infty, \infty)$.

25. Let $x \sim \mathrm{Gam}(\alpha, \beta)$ and let $y = 1/x$. Find the pdf of $y$. We say that $y$ has an inverse Gamma distribution with parameters $\alpha$ and $\beta$ and write $y \sim \mathrm{invGam}(\alpha, \beta)$.

26. Prove Theorem 5.17. Hint: Use the method of Theorem 5.12.

27. Let $x_1, \ldots, x_n \sim$ i.i.d. $\mathrm{U}(0, 1)$. Find the distribution of $x_{(n)}$, the largest order statistic.

28. In the R code to create Figure 2.19, explain how to use dgamma(...) instead of dpois(...).


29. Prove the claim on page 311 that the half-life of a radioactive isotope is $m = \lambda \log 2$.

30. Prove Theorem 5.15.

31. Prove Theorem 5.18.

32. Page 320 shows that the $n$-dimensional Normal density with a diagonal covariance matrix is the product of $n$ separate univariate Normal densities. In this problem you are to work in the opposite direction. Let $X_1, \ldots, X_n$ be independent Normally distributed random variables with means $\mu_1, \ldots, \mu_n$ and SD's $\sigma_1, \ldots, \sigma_n$.
    (a) Write down the density of each $X_i$.
    (b) Write down the joint density of $X_1, \ldots, X_n$.
    (c) Show that the joint density can be written in the form of Equation 5.6.
    (d) Derive the mean vector and covariance matrix of $X_1, \ldots, X_n$.

33. Prove Corollary 5.25.

34. Show, for every $x \in \mathbb{R}$,
    \[
    \lim_{p \to \infty} p_{t_p}(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}
    \]
    where $p_{t_p}(x)$ is the $t$ density with $p$ degrees of freedom, evaluated at $x$. Hint: use Stirling's formula. (This problem is Exercise 5.18(c) in Statistical Inference, 2nd ed. by Casella and Berger.)

35. It turns out that the $t$ distribution with $p$ degrees of freedom can be written as a mixture of Normal distributions, a fact that is sometimes useful in statistical calculations. Let $\tau \sim \mathrm{Gam}(p/2, 2/p)$ and, conditional on $\tau$, $y \sim \mathrm{N}(0, 1/\sqrt{\tau})$. Show that the marginal distribution of $y$ is the $t$ distribution with $p$ degrees of freedom.


CHAPTER 6

BAYESIAN STATISTICS

6.1 Multidimensional Bayesian Analysis

This chapter takes up Bayesian statistics. Modern Bayesian statistics relies heavily on computers, computation, programming, and algorithms, so that will be the major focus of this chapter. We cannot give a complete treatment here, but there are several good books that cover these topics in more depth. See, for example, Gelman et al. [2004], Liu [2004], Marin and Robert [2007], or Robert and Casella [1997].

Recall the framework of Bayesian inference from Section 2.5.

• We posit a parametric family of distributions $\{p(y \mid \theta)\}$.
• We express our old knowledge of $\theta$ through a prior probability density $p(\theta)$.
• The previous two items combine to yield $p(y, \theta)$ and, ultimately, $p(\theta \mid y)$.
• The posterior density $p(\theta \mid y)$ represents our new state of knowledge about $\theta$.

The posterior density is

\[
p(\theta \mid y) = \frac{p(\theta) \, p(y \mid \theta)}{\int p(\theta) \, p(y \mid \theta) \, d\theta} \propto p(\theta) \, p(y \mid \theta). \qquad (6.1)
\]

So far, so good. But in many interesting applications, $\theta$ is multi-dimensional and problems arise when we want to examine the posterior. Equation 6.1 tells us how to evaluate the posterior at any value of $\theta$, but that's not always sufficient for getting a sense of which values of $\theta$ are most likely, somewhat likely, unlikely,


etc. One way to develop a feeling for a multidimensional posterior is to examine a marginal posterior density, say

\[
p(\theta_1 \mid y) = \int \cdots \int p(\theta_1, \ldots, \theta_k \mid y) \, d\theta_2 \ldots d\theta_k. \qquad (6.2)
\]

Unfortunately, the integral in Equation 6.2 is often not analytically tractable and must be evaluated numerically. Standard numerical integration techniques such as quadrature may work well in low dimensions, but in Bayesian statistics Equation 6.2 is often sufficiently high dimensional that standard techniques are unreliable. Therefore, new numerical integration techniques are needed. The most important of these is called Markov chain Monte Carlo integration, or MCMC. Other techniques can be found in the references at the beginning of the chapter. For the purposes of this book, we investigate MCMC. But first, to get a feel for Bayesian analysis, we explore posteriors in low dimensional, numerically tractable examples.

The general situation is that there are multiple parameters $\theta_1, \ldots, \theta_k$, and data $y_1, \ldots, y_n$. We may be interested in marginal, conditional, or joint distributions of the parameters either a priori or a posteriori. Some examples:

• $p(\theta_1, \ldots, \theta_k)$, the joint prior
• $p(\theta_1, \ldots, \theta_k \mid y_1, \ldots, y_n)$, the joint posterior
• $p(\theta_1 \mid y_1, \ldots, y_n) = \int \cdots \int p(\theta_1, \ldots, \theta_k \mid y_1, \ldots, y_n) \, d\theta_2 \cdots d\theta_k$, the marginal posterior of $\theta_1$
• $p(\theta_2, \ldots, \theta_k \mid \theta_1, y_1, \ldots, y_n) = p(\theta_1, \ldots, \theta_k \mid y_1, \ldots, y_n) / p(\theta_1 \mid y_1, \ldots, y_n) \propto p(\theta_1, \ldots, \theta_k \mid y_1, \ldots, y_n)$, the conditional joint posterior density of $(\theta_2, \ldots, \theta_k)$ given $\theta_1$, where the "$\propto$" means that we substitute $\theta_1$ into the numerator and treat the denominator as a constant.

The examples in this section illustrate the ideas and show how to do low-dimensional integrals in R.

Example 6.1 (Pine Cones)
One possible result of increased CO$_2$ in the atmosphere is that plants will use some of the excess carbon for reproduction, instead of growth. They may, for example produce


more seeds, produce bigger seeds, produce seeds earlier in life, or produce seeds when they, the plants, are smaller. To investigate this possibility in the Duke FACE experiment (See Example 1.12 and its sequels.) a graduate student went to the FACE site each year and counted the number of pine cones on pine trees in the control and treatment plots. (citation here and Example 3.8)

    ring     ID  xcoor  ycoor  spec   dbh  1998  1999  2000
       1  11003   0.71   0.53  pita  19.4     0     0     0
       1  11004   1.26   2.36  pita  14.1     0     0     4
       1  11011   1.44   6.16  pita  19.4     0     6     0
       1  11013   3.56   5.84  pita  21.6     0     0     0
       1  11017   3.75   8.08  pita  10.8     0     0     0
       .
       .
       6  68053   0.82  10.73  pita  14.4     0     0     0
       6  68055  -2.24  13.34  pita  11       0     0     0
       6  68057  -0.78  14.21  pita   8       0     0     0
       6  68058   0.76  14.55  pita  10.6     0     0     0
       6  68059   1.48  13     pita  21.2     0     5    10

Table 6.1: The numbers of pine cones on trees in the FACE experiment, 1998-2000.

The data are in Table 6.1. The first column is ring. (Rings 1, 5, 6 were control; rings 2, 3, 4 were treatment.) The next column, ID, identifies each tree uniquely; xcoor and ycoor give the location of the tree. The next column, spec, gives the species; pita stands for pinus taeda, or loblolly pine, the dominant canopy tree in the FACE experiment. The column dbh gives diameter at breast height, a common way for foresters and ecologists to measure the size of a tree. The final three columns show the number of pine cones in 1998, 1999, and 2000. In this example we'll look at the data for the year 2000. We want to learn about the relationship between dbh and the number of pine cones, and whether that relationship is the same in the control and treatment plots.

Figures 6.1, 6.2, and 6.3 plot the numbers of pine cones as a function of dbh in the years 1998-2000. In 1998, very few trees had pine cones and those that did had very few. But by 1999, many more trees had cones and had them in greater number. There does not appear to be a substantial difference between 1999 and 2000. As a quick check of our visual impression we can count the fraction of pine trees having pine cones each year, by ring. The following R code does the job.

    for ( i in 1:6 ) {
      good <- cones$ring == i   # trees in ring i
      # fraction of those trees with at least one cone, by year
      print ( c ( sum ( cones$X1998[good] > 0 ) / sum(good),
                  sum ( cones$X1999[good] > 0 ) / sum(good),
                  sum ( cones$X2000[good] > 0 ) / sum(good) ) )
    }

    [1] 0.0000000 0.1562500 0.2083333
    [1] 0.05633803 0.36619718 0.32394366
    [1] 0.01834862 0.21100917 0.27522936
    [1] 0.05982906 0.39316239 0.37606838
    [1] 0.01923077 0.10576923 0.22115385
    [1] 0.04081633 0.19727891 0.18367347

Since there's not much action in 1998 we will ignore the data from that year. The data show a greater contrast between treatment (rings 2, 3, 4) and control (rings 1, 5, 6) in 1999 than in 2000. So for the purpose of this example we'll use the data from 1999. A good scientific investigation, though, would use data from all years.

We're looking for a model with two features: (1) the probability of cones is an increasing function of dbh and of the treatment and (2) given that a tree has cones, the number of cones is an increasing function of dbh and treatment. Here we describe a simple model with these features. The idea is (1) a logistic regression with covariates dbh and treatment for the probability that a tree is sexually mature and (2) a Poisson regression with covariates dbh and treatment for the number of cones given that a tree is sexually mature. Let $Y_i$ be the number of cones on the $i$'th tree. Our model is

\[
\begin{aligned}
x_i &= \begin{cases} 1 & \text{if the $i$'th tree had extra CO}_2 \\ 0 & \text{otherwise} \end{cases} \\
\theta_i &= \begin{cases} 1 & \text{if the $i$'th tree is sexually mature} \\ 0 & \text{otherwise} \end{cases} \\
\pi_i &= P[\theta_i = 1] = \frac{\exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)}{1 + \exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)} \\
\phi_i &= \exp(\gamma_0 + \gamma_1 \mathrm{dbh}_i + \gamma_2 x_i) \\
Y_i &\sim \mathrm{Poi}(\theta_i \phi_i)
\end{aligned}
\]

There are six unknown parameters: $\beta_0$, $\beta_1$, $\beta_2$, $\gamma_0$, $\gamma_1$, $\gamma_2$. We must assign prior distributions and compute posterior distributions of these parameters. In addition, each tree has an indicator $\theta_i$ and we will be able to calculate the posterior probabilities $P[\theta_i = 1 \mid y_1, \ldots, y_n]$ for $i = 1, \ldots, n$.
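To see what kind of data this two-part model produces, one can simulate from it. The sketch below is only an illustration: the symbols follow the model above (beta for the logistic-regression coefficients, gam for the Poisson-regression coefficients, theta for the maturity indicators), while the sample size, the dbh values, the treatment assignment and every coefficient value are invented for the example and are not estimates from the FACE data.

```r
# Simulate cone counts from the two-part model. All numbers below are
# hypothetical choices for illustration, not estimates from the data.
set.seed(3)
n.tree <- 200
dbh    <- runif(n.tree, 5, 25)     # hypothetical diameters at breast height
x      <- rbinom(n.tree, 1, .5)    # 1 = extra CO2, 0 = control

beta <- c(-6, .3, 1)               # hypothetical (beta0, beta1, beta2)
gam  <- c(0, .05, .5)              # hypothetical (gamma0, gamma1, gamma2)

pi.i  <- plogis(beta[1] + beta[2] * dbh + beta[3] * x)  # P[tree i is mature]
phi.i <- exp(gam[1] + gam[2] * dbh + gam[3] * x)        # mean cones if mature
theta <- rbinom(n.tree, 1, pi.i)                        # maturity indicators
y     <- rpois(n.tree, theta * phi.i)                   # immature trees get 0

# Cross-tabulate maturity against having any cones.
print(table(mature = theta, has.cones = y > 0))
```

The table makes the two sources of zeros visible: every immature tree has no cones, while a mature tree can also end up with none when its Poisson count happens to be zero.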


Figure 6.1: Numbers of pine cones in 1998 as a function of dbh


Figure 6.2: Numbers of pine cones in 1999 as a function of dbh


Figure 6.3: Numbers of pine cones in 2000 as a function of dbh


We start with the priors $\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2 \sim \text{i.i.d. } U(-100, 100)$. This prior distribution is not, obviously, based on any substantive prior knowledge. Instead of arguing that this is a sensible prior, we will later check the robustness of conclusions to specification of the prior. If the conclusions are robust, then we will argue that almost any sensible prior would lead to roughly the same conclusions.

To begin the analysis we write down the joint distribution of parameters and data.

\[
\begin{aligned}
p(y_1, \dots, y_n, \beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2)
&= p(\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2)\, p(y_1, \dots, y_n \mid \beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2) \\
&= \left(\frac{1}{200}\right)^{6} 1_{(-100,100)}(\beta_0)\, 1_{(-100,100)}(\beta_1)\, 1_{(-100,100)}(\beta_2)\, 1_{(-100,100)}(\gamma_0)\, 1_{(-100,100)}(\gamma_1)\, 1_{(-100,100)}(\gamma_2) \\
&\quad \times \prod_{i : y_i > 0} \left[ \frac{\exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)}{1 + \exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)} \times \frac{\exp\left(-\exp(\gamma_0 + \gamma_1 \mathrm{dbh}_i + \gamma_2 x_i)\right) \exp(\gamma_0 + \gamma_1 \mathrm{dbh}_i + \gamma_2 x_i)^{y_i}}{y_i!} \right] \\
&\quad \times \prod_{i : y_i = 0} \left[ \frac{1}{1 + \exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)} + \frac{\exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)}{1 + \exp(\beta_0 + \beta_1 \mathrm{dbh}_i + \beta_2 x_i)} \exp\left(-\exp(\gamma_0 + \gamma_1 \mathrm{dbh}_i + \gamma_2 x_i)\right) \right]
\end{aligned}
\tag{6.3}
\]

In Equation 6.3 each term in the product $\prod_{i : y_i > 0}$ is

\[ P[\text{$i$'th tree is sexually mature}] \times p(y_i \mid \text{$i$'th tree is sexually mature}) \]

while each term in $\prod_{i : y_i = 0}$ is

\[ P[\text{$i$'th tree is immature}] + P[\text{$i$'th tree is mature but produces no cones}]. \]

The posterior $p(\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2 \mid y_1, \dots, y_n)$ is proportional, as a function of $(\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2)$, to Equation 6.3. Similarly, conditional posteriors such as $p(\beta_0 \mid \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2, y_1, \dots, y_n)$ are proportional, as a function of $\beta_0$, to Equation 6.3. But that doesn't allow for much simplification; it allows us only to ignore the factorials in the denominator.

To learn about the posterior in, say, Equation 6.3 it is easy to write an R function that accepts $(\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2)$ as input and returns 6.3 as output. But that's quite a complicated function of $(\beta_0, \beta_1, \beta_2, \gamma_0, \gamma_1, \gamma_2)$ and it's not obvious how to use the function or what it says about any of the six parameters. Therefore, in Section 6.2 we present an algorithm that is very powerful for evaluating the integrals that often arise in multivariate Bayesian analyses.
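One way such a function might look is sketched below. It returns the logarithm of Equation 6.3 up to an additive constant, since logs are numerically safer than products of many small numbers. The function name `log.post`, the argument names, and the two-tree data set in the usage lines are hypothetical; the parameter names `b0`, ..., `g2` are chosen to match the code used later in the chapter.

```r
# Log of Equation 6.3, up to an additive constant.
# params: named vector with components b0, b1, b2, g0, g1, g2
# y: cone counts; dbh: diameters; x: 1 for extra CO2, 0 for control
log.post <- function(params, y, dbh, x) {
  # the U(-100, 100) priors give zero density outside the box
  if (any(abs(params) >= 100)) return(-Inf)
  eta.b <- params["b0"] + params["b1"] * dbh + params["b2"] * x
  eta.g <- params["g0"] + params["g1"] * dbh + params["g2"] * x
  pi.i  <- plogis(eta.b)   # probability of sexual maturity
  phi.i <- exp(eta.g)      # Poisson mean for a mature tree
  pos   <- y > 0
  # trees with cones: mature, then Poisson count
  sum(log(pi.i[pos]) + dpois(y[pos], phi.i[pos], log = TRUE)) +
    # trees without cones: immature, or mature with a zero count
    sum(log((1 - pi.i[!pos]) + pi.i[!pos] * exp(-phi.i[!pos])))
}

# usage on a tiny made-up data set
p0 <- c(b0 = 0, b1 = 0, b2 = 0, g0 = 0, g1 = 0, g2 = 0)
log.post(p0, y = c(0, 2), dbh = c(10, 20), x = c(0, 1))
```

Because `dpois` includes the factorial term, the value differs from the log of Equation 6.3 only by the constant $6 \log(1/200)$, which does not depend on the parameters.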


6.2 The Metropolis, Metropolis-Hastings, and Gibbs Sampling Algorithms

In Markov chain Monte Carlo, the term "Monte Carlo" refers to evaluating an integral by using many random draws from a distribution. To fix ideas, suppose we want to evaluate Equation 6.2 and let $\vec\theta = (\theta_1, \dots, \theta_k)$. If we could generate many samples $\vec\theta_1, \dots, \vec\theta_M$ of $\vec\theta$ (where $\vec\theta_i = (\theta_{i,1}, \dots, \theta_{i,k})$) from its posterior distribution then we could approximate Equation 6.2 by

1. discarding $\theta_{i,2}, \dots, \theta_{i,k}$ from each iteration,
2. retaining $\theta_{1,1}, \dots, \theta_{M,1}$,
3. using $\theta_{1,1}, \dots, \theta_{M,1}$ and standard density estimation techniques (page 105) to estimate $p(\theta_1 \mid y)$, or
4. for any set $A$, using $(\text{number of } \theta_{i,1}\text{'s in } A)/M$ as an estimate of $P[\theta_1 \in A \mid y]$.

That's the idea behind Monte Carlo integration.

The term "Markov chain" refers to how the samples $\vec\theta_1, \dots, \vec\theta_M$ are produced. In a Markov chain there is a transition density or transition kernel $k(\vec\theta_i \mid \vec\theta_{i-1})$ which is a density for generating $\vec\theta_i$ given $\vec\theta_{i-1}$. We first choose $\vec\theta_1$ almost arbitrarily, then generate $\vec\theta_2 \mid \vec\theta_1$, $\vec\theta_3 \mid \vec\theta_2$, and so on, in succession, for as many steps as we like. Each $\vec\theta_i$ has a density $p_i \equiv p(\vec\theta_i)$ which depends on $\vec\theta_1$ and the transition kernel. But,

1. under some fairly benign conditions (see the references at the beginning of the chapter for details) the sequence $p_1, p_2, \dots$ converges to a limit $p$, the stationary distribution, that does not depend on $\vec\theta_1$;
2. the transition density $k(\vec\theta_i \mid \vec\theta_{i-1})$ can be chosen so that the stationary distribution $p$ is equal to $p(\vec\theta \mid y)$;
3. we can find an $m$ such that $i > m \Rightarrow p_i \approx p = p(\vec\theta \mid y)$;
4. then $\vec\theta_{m+1}, \dots, \vec\theta_M$ are, approximately, a sample from $p(\vec\theta \mid y)$.
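The Monte Carlo steps listed above can be sketched in a few lines of R. To keep the sketch self-contained we assume, purely for illustration, that the draws are already available and independent: the first coordinate is drawn from a Be(5, 2) distribution and the second from a standard Normal, neither of which is a posterior from the text. The names `draws`, `theta1`, and `p.hat` are our own.

```r
set.seed(2)
M <- 20000
# stand-in for M draws of a two-dimensional parameter
draws <- cbind(theta1 = rbeta(M, 5, 2), theta2 = rnorm(M))

theta1 <- draws[, "theta1"]     # steps 1 and 2: keep only the first coordinate
dens   <- density(theta1)       # step 3: kernel density estimate of its marginal
p.hat  <- mean(theta1 > 0.5)    # step 4 with A = (0.5, 1)

# here the exact answer is known, so the estimate can be checked
c(estimate = p.hat, exact = 1 - pbeta(0.5, 5, 2))
```

In a real analysis the exact answer is not available, which is the reason for sampling in the first place; the check is possible here only because the stand-in distribution is a known Beta.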


The Metropolis-Hastings algorithm [Metropolis et al., 1953, Hastings, 1970] is one way to construct an MCMC algorithm whose stationary distribution is $p(\vec\theta \mid y)$. It works according to the following steps.

1. Choose a proposal density $g(\vec\theta^* \mid \vec\theta)$.
2. Choose $\vec\theta_1$.
3. For $i = 2, 3, \dots$
   Generate a proposal $\vec\theta^*$ from $g(\cdot \mid \vec\theta_{i-1})$.
   Set
   \[ r \equiv \min\left\{ 1, \frac{p(\vec\theta^* \mid y)\, g(\vec\theta_{i-1} \mid \vec\theta^*)}{p(\vec\theta_{i-1} \mid y)\, g(\vec\theta^* \mid \vec\theta_{i-1})} \right\}. \tag{6.4} \]
   Set
   \[ \vec\theta_i = \begin{cases} \vec\theta^* & \text{with probability } r \\ \vec\theta_{i-1} & \text{with probability } 1 - r \end{cases} \]

Steps 1–3 define the transition kernel $k$. In many MCMC chains, the acceptance probability $r$ may be strictly less than one, so the kernel $k$ is a mixture of two parts: one that generates a new value of $\vec\theta_{i+1} \ne \vec\theta_i$ and one that sets $\vec\theta_{i+1} = \vec\theta_i$.

To illustrate MCMC, suppose we want to generate a sample $\theta_1, \dots, \theta_{10,000}$ from the Be(5, 2) distribution. We arbitrarily choose a proposal density $g(\theta^* \mid \theta) = U(\theta - .1, \theta + .1)$ and arbitrarily choose $\theta_1 = 0.5$. The following R code draws the sample.

    samp <- rep(NA, 10000)
    samp[1] <- 0.5
    for (i in 2:10000) {
      prev <- samp[i-1]
      thetastar <- runif(1, prev - .1, prev + .1)
      r <- min(1, dbeta(thetastar, 5, 2) / dbeta(prev, 5, 2))
      if (rbinom(1, 1, r) == 1)
        new <- thetastar
      else
        new <- prev
      samp[i] <- new
    }
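The code above uses a symmetric proposal, so the proposal densities cancel in Equation 6.4. For a proposal that is not symmetric the ratio of proposal densities must be kept. The sketch below is one way to write a one-dimensional Metropolis-Hastings sampler that keeps it; the function `mh` and its argument names are hypothetical, and the Be(2, 1) proposal is chosen only to give a non-symmetric example. Working on the log scale avoids underflow when densities are very small.

```r
# One-dimensional Metropolis-Hastings following Equation 6.4, on the log scale.
# log.p(th):        log target density, up to an additive constant
# r.prop(cur):      draws one proposal given the current value
# log.g(to, from):  log proposal density of 'to' given 'from'
mh <- function(log.p, r.prop, log.g, init, n.iter) {
  out <- numeric(n.iter)
  out[1] <- init
  for (i in 2:n.iter) {
    prev <- out[i - 1]
    star <- r.prop(prev)
    log.r <- log.p(star) + log.g(prev, star) -
             log.p(prev) - log.g(star, prev)
    # reject if the ratio is undefined; otherwise accept with probability min(1, r)
    accept <- !is.na(log.r) && log(runif(1)) < log.r
    out[i] <- if (accept) star else prev
  }
  out
}

set.seed(3)
# target Be(5, 2); proposals drawn from Be(2, 1) whatever the current value
samp.mh <- mh(log.p  = function(th) dbeta(th, 5, 2, log = TRUE),
              r.prop = function(cur) rbeta(1, 2, 1),
              log.g  = function(to, from) dbeta(to, 2, 1, log = TRUE),
              init = 0.5, n.iter = 20000)
mean(samp.mh[-(1:1000)])   # compare with the Be(5, 2) mean, 5/7
```

With a symmetric proposal the two `log.g` terms are equal and the function reduces to the simpler loop shown in the text.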


Figure 6.4: 10,000 MCMC samples of the Be(5, 2) density. Top panel: histogram of samples from the Metropolis-Hastings algorithm and the Be(5, 2) density. Middle panel: $\theta_i$ plotted against $i$. Bottom panel: $p(\theta_i)$ plotted against $i$.


The top panel of Figure 6.4 shows the result. The solid curve is the Be(5, 2) density and the histogram is made from the Metropolis-Hastings samples. They match closely, showing that the algorithm performed well.

Figure 6.4 was produced by

    par(mfrow=c(3,1))
    hist(samp[-(1:1000)], prob=TRUE, xlab=expression(theta),
         ylab="", main="")
    x <- seq(0, 1, length=100)
    lines(x, dbeta(x, 5, 2))
    plot(samp, pch=".", ylab=expression(theta))
    plot(dbeta(samp, 5, 2), pch=".", ylab=expression(p(theta)))

The code samp[-(1:1000)] discards the first 1000 draws in the hope that the sampler will have converged to its stationary distribution after 1000 iterations.

Assuming that convergence conditions have been met and that the algorithm is well-constructed, MCMC chains are guaranteed eventually to converge and deliver samples from the desired distribution. But the guarantee is asymptotic and in practice the output from the chain should be checked to diagnose potential problems that might arise in finite samples.

The main thing to check is mixing. An MCMC algorithm operates in the space of $\vec\theta$. At each iteration of the chain, i.e., for each value of $i$, there is a current location $\vec\theta_i$. At the next iteration the chain moves to a new location $\vec\theta_{i+1}$. In this way the chain explores the $\vec\theta$ space. While it is exploring it also evaluates $p(\vec\theta_i)$. In theory, the chain should spend many iterations at values of $\vec\theta$ where $p(\vec\theta)$ is large (and hence deliver many samples of $\vec\theta$'s with large posterior density) and few iterations at values where $p(\vec\theta)$ is small. For the chain to do its job it must find the mode or modes of $p(\vec\theta)$, it must move around in their vicinity, and it must move between them. The process of moving from one part of the space to another is called mixing.

The middle and bottom panels of Figure 6.4 illustrate mixing. The middle panel plots $\theta_i$ vs. $i$. It shows that the chain spends most of its iterations in values of $\theta$ between about 0.6 and 0.9 but makes occasional excursions down to 0.4 or 0.2 or so. After each excursion it comes back to the mode around $\theta = 0.8$. The chain has taken many excursions, so it has explored the space well. The bottom panel plots $p(\theta_i)$ vs. $i$. It shows that the chain spent most of its time near the mode where


$p(\theta) \approx 2.4$ but made multiple excursions down to places where $p(\theta)$ is around 0.5, or even less. This chain mixed well.

To illustrate poor mixing we'll use the same MCMC algorithm but with different proposal kernels. First we'll use $g(\theta^* \mid \theta) = U(\theta - 100, \theta + 100)$ and change the corresponding line of code to thetastar <- runif(1, prev-100, prev+100). Then we'll use $g(\theta^* \mid \theta) = U(\theta - .00001, \theta + .00001)$ and change the corresponding line of code to thetastar <- runif(1, prev-.00001, prev+.00001). Figure 6.5 shows the result. The left-hand side of the figure is for $g(\theta^* \mid \theta) = U(\theta - 100, \theta + 100)$. The top panel shows a very much rougher histogram than Figure 6.4; the middle and bottom panels show why. The proposal radius is so large that most proposals are rejected; therefore, $\theta_{i+1} = \theta_i$ for many iterations; therefore we get the flat spots in the middle and bottom panels. The plots reveal that the sampler explored fewer than 30 separate values of $\theta$. That's too few; the sampler has not mixed well. In contrast, the right-hand side of the figure, for $g(\theta^* \mid \theta) = U(\theta - .00001, \theta + .00001)$, shows that $\theta$ has drifted steadily downward, but over a very small range. There are no flat spots, so the sampler is accepting most proposals, but the proposal radius is so small that the sampler hasn't yet explored most of the space. It too has not mixed well.

Plots such as the middle and bottom plots of Figure 6.5 are called trace plots because they trace the path of the sampler.

In this problem, good mixing depends on getting the proposal radius not too large and not too small, but just right. To be sure, if we run the MCMC chain long enough, all three samplers would yield good samples from Be(5, 2). But the first sampler mixed well with only 10,000 iterations while the others would require many more iterations to yield a good sample. In practice, one must examine the output of one's MCMC chain to diagnose mixing problems. No diagnostics are fool proof, but not diagnosing is foolhardy.

Several special cases of the Metropolis-Hastings algorithm deserve separate mention.
Metropolis algorithm  It is often convenient to choose the proposal density $g(\vec\theta^* \mid \vec\theta)$ to be symmetric; i.e., so that $g(\vec\theta^* \mid \vec\theta) = g(\vec\theta \mid \vec\theta^*)$. In this case the Metropolis ratio $p(\vec\theta^* \mid y)\, g(\vec\theta_{i-1} \mid \vec\theta^*) / p(\vec\theta_{i-1} \mid y)\, g(\vec\theta^* \mid \vec\theta_{i-1})$ simplifies to $p(\vec\theta^* \mid y) / p(\vec\theta_{i-1} \mid y)$. That's what happened in the Be(5, 2) illustration and why the line r <- min(1, dbeta(thetastar,5,2) / dbeta(prev,5,2)) doesn't involve $g$.

Independence sampler  It may be convenient to choose $g(\vec\theta^* \mid \vec\theta) = g(\vec\theta^*)$ not dependent on $\vec\theta$. For example, we could have used thetastar <- runif(1) in the Be(5, 2) illustration.

Multiple transition kernels  We may construct multiple transition kernels, say $g_1, \dots, g_m$. Then for each iteration of the MCMC chain we can randomly choose $j \in \{1, \dots, m\}$ and make a proposal according to $g_j$. We would do this either for convenience or to improve the convergence rate and mixing properties of the chain.

Gibbs sampler  [Geman and Geman, 1984] In many practical examples, the so-called full conditionals or complete conditionals $p(\theta_j \mid \vec\theta_{(-j)}, y)$ are known and easy to sample for all $j$, where $\vec\theta_{(-j)} = (\theta_1, \dots, \theta_{j-1}, \theta_{j+1}, \dots, \theta_k)$. In this case we may sample $\theta_{i,j}$ from $p(\theta_j \mid \theta_{i,1}, \dots, \theta_{i,j-1}, \theta_{i-1,j+1}, \dots, \theta_{i-1,k}, y)$ for $j = 1, \dots, k$ and set $\vec\theta_i = (\theta_{i,1}, \dots, \theta_{i,k})$. We would do this for convenience.

Figure 6.5: 10,000 MCMC samples of the Be(5, 2) density. Left column: $g(\theta^* \mid \theta) = U(\theta - 100, \theta + 100)$; Right column: $g(\theta^* \mid \theta) = U(\theta - .00001, \theta + .00001)$. Top: histogram of samples from the Metropolis-Hastings algorithm and the Be(5, 2) density. Middle: $\theta_i$ plotted against $i$. Bottom: $p(\theta_i)$ plotted against $i$.

The next example illustrates several MCMC algorithms on the pine cone data of Example 6.1.

Example 6.2 (Pine Cones, cont.)
In this example we try several MCMC algorithms to evaluate and display the posterior distribution in Equation 6.3. Throughout this example, we shall, for compactness, refer to the posterior density as $p(\vec\theta)$ instead of $p(\vec\theta \mid y_1, \dots, y_n)$.

First we need functions to return the prior density and the likelihood function.

    dprior <- function(params, log=FALSE) {
      logprior <- dunif(params["b0"], -100, 100, log=TRUE) +
                  dunif(params["b1"], -100, 100, log=TRUE) +
                  dunif(params["b2"], -100, 100, log=TRUE) +
                  dunif(params["g0"], -100, 100, log=TRUE) +
                  dunif(params["g1"], -100, 100, log=TRUE) +
                  dunif(params["g2"], -100, 100, log=TRUE)
      if (log) return(logprior)
      else return(exp(logprior))
    }

    lik <- function(params, n.cones=cones$X2000, dbh=cones$dbh,
                    trt=cones$trt, log=FALSE) {
      zero <- (n.cones == 0)
      # linear predictors for the logistic and Poisson parts
      tmp1 <- params["b0"] + params["b1"] * dbh + params["b2"] * trt
      tmp2 <- params["g0"] + params["g1"] * dbh + params["g2"] * trt
      etmp1 <- exp(tmp1)
      etmp2 <- exp(tmp2)
      # log of the likelihood in Equation 6.3, without the factorials
      loglik <- sum(tmp1[!zero]) -
                sum(etmp2[!zero]) +
                sum(n.cones[!zero] * tmp2[!zero]) +
                sum(log(1 + etmp1[zero] * exp(-etmp2[zero]))) -
                sum(log(1 + etmp1))
      if (log) return(loglik)
      else return(exp(loglik))
    }

Now we write a proposal function. This one makes $\vec\theta^* \mid \vec\theta \sim N(\vec\theta, .1\, I_6)$, where $I_6$ is the $6 \times 6$ identity matrix.

    library(MASS)   # provides mvrnorm
    g.all <- function(params) {
      sig <- c(.1, .1, .1, .1, .1, .1)
      proposed <- mvrnorm(1, mu=params, Sigma=diag(sig))
      return(list(proposed=proposed, ratio=1))
    }

Finally we write the main part of the code. Try to understand it; you may have to write something similar. Notice an interesting feature of R: assigning names to the components of params allows us to refer to the components by name in the lik function.

    # initial values
    params <- c("b0"=0, "b1"=0, "b2"=0, "g0"=0, "g1"=0, "g2"=0)

    # number of iterations
    mc <- 10000

    # storage for output
    mcmc.out <- matrix(NA, mc, length(params)+1)

    # the main loop


    for (i in 1:mc) {
      prop <- g.all(params)
      new <- prop$proposed
      log.accept.ratio <- dprior(new, log=TRUE) -
                          dprior(params, log=TRUE) +
                          lik(new, log=TRUE) -
                          lik(params, log=TRUE) -
                          log(prop$ratio)
      accept.ratio <- min(1, exp(log.accept.ratio))
      if (as.logical(rbinom(1, 1, accept.ratio)))
        params <- new
      # store the current parameters and their log likelihood
      mcmc.out[i,] <- c(params, lik(params, log=TRUE))
    }

Figure 6.6 shows trace plots of the output. The plots show that the sampler did not move very often; it did not mix well and did not explore the space effectively.

Figure 6.6 was produced by the following snippet.

    par(mfrow=c(4,2), mar=c(4,4,1,1)+.1)
    for (i in 1:6)
      plot(mcmc.out[,i], ylab=names(params)[i], pch=".")
    plot(mcmc.out[,7], ylab=expression(p(theta)), pch=".")

When samplers get stuck, sometimes it's because the proposal radius is too large. So next we try a smaller radius: sig <- rep(.01, 6). Figure 6.7 shows the result. The sampler is still not mixing well. One of the intercepts travelled from its starting point of 0 to about $-1.4$ or so, then seemed to get stuck; other parameters behaved similarly. Let's try running the chain for more iterations: mc <- 100000. Figure 6.8 shows the result. Again, the sampler does not appear to have mixed well. An intercept and its dbh coefficient, for example, have not yet settled into any sort of steady-state behavior and $p(\vec\theta)$ seems to be steadily increasing, indicating that the sampler may not yet have found the posterior mode.
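A numerical companion to trace plots is the fraction of iterations at which the chain actually moved. The sketch below computes it for the Be(5, 2) random-walk sampler of this section under three proposal radii, because that example runs without the pine cone data. The helper names `run.chain` and `accept.rate` are our own, not part of the chapter's code.

```r
# Random-walk Metropolis sampler for Be(5, 2) with a uniform proposal
# of the given radius.
run.chain <- function(radius, n.iter = 10000, init = 0.5) {
  samp <- numeric(n.iter)
  samp[1] <- init
  for (i in 2:n.iter) {
    prev <- samp[i - 1]
    star <- runif(1, prev - radius, prev + radius)
    # dbeta is 0 outside [0, 1], so such proposals are always rejected
    r <- min(1, dbeta(star, 5, 2) / dbeta(prev, 5, 2))
    samp[i] <- if (runif(1) < r) star else prev
  }
  samp
}

# fraction of iterations at which the chain moved to a new value
accept.rate <- function(samp) mean(diff(samp) != 0)

set.seed(4)
rates <- sapply(c(big = 100, moderate = 0.1, tiny = 0.00001),
                function(rad) accept.rate(run.chain(rad)))
round(rates, 3)
```

A rate near zero signals the flat spots seen when the radius is too large. A rate near one is not by itself reassuring: a very small radius also gives a rate near one while the chain barely travels, so the rate should be read together with the trace plots. For a multi-parameter chain stored in a matrix, the same idea can be applied to a single column.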


Figure 6.6: Trace plots of MCMC output from the pine cone code on page 358.


Figure 6.7: Trace plots of MCMC output from the pine cone code with a smaller proposal radius.


Figure 6.8: Trace plots of MCMC output from the pine cone code with a smaller proposal radius and 100,000 iterations. The plots show every 10'th iteration.


It is not always necessary to plot every iteration of an MCMC sampler. Figure 6.8 plots every 10'th iteration; plots of every iteration look similar. The figure was produced by the following snippet.

    par(mfrow=c(4,2), mar=c(4,4,1,1)+.1)
    plotem <- seq(1, 100000, by=10)
    for (i in 1:6)
      plot(mcmc.out[plotem,i], ylab=names(params)[i], pch=".")
    plot(mcmc.out[plotem,7], ylab=expression(p(theta)), pch=".")

The sampler isn't mixing well. To write a better one we should try to understand why this one is failing. It could be that proposing a change in all parameters simultaneously is too dramatic: once the sampler reaches a location where $p(\vec\theta)$ is large, changing all the parameters at once is likely to result in a location where $p(\vec\theta)$ is small, therefore the acceptance ratio will be small, and the proposal will likely be rejected. To ameliorate the problem we'll try proposing a change to only one parameter at a time. The new proposal function is

    g.one <- function(params) {
      sig <- c("b0"=.1, "b1"=.1, "b2"=.1, "g0"=.1, "g1"=.1, "g2"=.1)
      which <- sample(names(params), 1)
      proposed <- params
      proposed[which] <- rnorm(1, mean=params[which], sd=sig[which])
      return(list(proposed=proposed, ratio=1))
    }

which randomly chooses one of the six parameters and proposes to update that parameter only. Naturally, we edit the main loop to use g.one instead of g.all. Figure 6.9 shows the result. This is starting to look better. Parameters $\beta_2$ and $\gamma_2$ are exhibiting steady-state behavior; so is one of the intercept-slope pairs (the parameters with subscripts 0 and 1), after iteration 10,000 or so ($x = 1000$ in the plots). Still, the other intercept-slope pair does not look like it has converged.

Figure 6.10 illuminates some of the problems. In particular, $\beta_0$ and $\beta_1$ seem to be linearly related, as do $\gamma_0$ and $\gamma_1$. This is often the case in regression problems; and we have seen it before for the pine cones in Figure 3.15. In the current setting it means that $p(\vec\theta \mid y_1, \dots, y_n)$ has ridges: one along a line in the $(\beta_0, \beta_1)$ plane and another along a line in the $(\gamma_0, \gamma_1)$ plane.
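The claim that intercepts and slopes tend to be linearly related in regression problems can be checked quickly on simulated data. The sketch below is only an illustration: the data are invented, and it looks at the estimated sampling correlation from a maximum likelihood logistic fit rather than at a posterior, on the assumption that the two behave similarly when the prior is flat.

```r
set.seed(5)
dbh <- runif(200, 5, 25)                        # hypothetical diameters
y   <- rbinom(200, 1, plogis(-5 + 0.3 * dbh))   # made-up maturity indicators
fit <- glm(y ~ dbh, family = binomial)

# correlation between the estimated intercept and slope
rho <- cov2cor(vcov(fit))["(Intercept)", "dbh"]
rho
```

The correlation is strongly negative because the covariate values are all far from zero: raising the slope must be offset by lowering the intercept to keep the fitted probabilities near the data. That trade-off is what produces a ridge.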


Figure 6.9: Trace plots of MCMC output from the pine cone code with proposal function g.one and 100,000 iterations. The plots show every 10'th iteration.


Figure 6.10: Pairs plots of MCMC output from the pine cones example.


Figure 6.10 was produced by the following snippet.

    plotem <- seq(10000, 100000, by=10)
    pairs(mcmc.out[plotem,], pch=".",
          labels=c(names(params), "density"))

As the figure shows, it took the first 10,000 iterations or so for one intercept-slope pair to reach a roughly steady state and for $p(\vec\theta)$ to climb to a reasonably large value. If those iterations were included in Figure 6.10, the points after iteration 10,000 would be squashed together in a small region. Therefore we made plotem <- seq(10000, 100000, by=10) to drop the first 9999 iterations from the plots.

If our MCMC algorithm proposes a move along the ridge, the proposal is likely to be accepted. But if the algorithm proposes a move that takes us off the ridge, the proposal is likely to be rejected because $p$ would be small and therefore the acceptance ratio would be small off the ridge. But that's not happening here: our MCMC algorithm seems not to be stuck, so we surmise that it is proposing moves that are small compared to the widths of the ridges. However, because the proposals are small, the chain does not explore the space quickly. That's why the other intercept-slope pair appears not to have reached a steady state. We could improve the algorithm by proposing moves that are roughly parallel to the ridges. And we can do that by making multivariate Normal proposals with a covariance matrix that approximates the posterior covariance of the parameters. We'll do that by finding the covariance of the samples we've generated and using it as the covariance matrix of our proposal distribution. The R code is

    Sig <- cov(mcmc.out[10000:100000,-7])
    g.group <- function(params) {
      proposed <- mvrnorm(1, mu=params, Sigma=Sig)
      return(list(proposed=proposed, ratio=1))
    }

We drop the first 9999 iterations because they seem not to reflect $p(\vec\theta)$ accurately. Then we calculate the covariance matrix of the samples from the previous MCMC sampler. That covariance matrix is used in the proposal function. The results are shown in Figures 6.11 and 6.12. Figure 6.11 shows that the sampler seems to have converged after the first several thousand iterations. The posterior density has risen to a high level and is hovering there; all six variables appear to be mixing well. Figure 6.12 confirms our earlier impression


that the posterior density seems to be approximately Normal (at least, it has Normal-looking two-dimensional marginals) with $\beta_0$ and $\beta_1$ highly correlated with each other, $\gamma_0$ and $\gamma_1$ highly correlated with each other, and no other large correlations. The sampler seems to have found one mode and to be exploring it well.

Figures 6.11 and 6.12 were produced with the following snippet.

    plotem <- seq(1, 100000, by=10)
    par(mfrow=c(4,2), mar=c(4,4,1,1)+.1)
    for (i in 1:6)
      plot(mcmc.out[plotem,i], ylab=names(params)[i], pch=".")
    plot(mcmc.out[plotem,7], ylab=expression(p(theta)), pch=".")

    plotem <- seq(1000, 100000, by=10)
    pairs(mcmc.out[plotem,], pch=".",
          labels=c(names(params), "density"))

Now that we have a good set of samples from the posterior, we can use it to answer substantive questions. For instance, we might want to know whether the extra atmospheric CO$_2$ has allowed pine trees to reach sexual maturity at an earlier age or to produce more pine cones. This is a question of whether $\beta_2$ and $\gamma_2$ are positive, negative, or approximately zero. Figure 6.13 shows the answer by plotting the posterior densities of $\beta_2$ and $\gamma_2$. Both densities put almost all their mass on positive values, indicating that $P[\beta_2 > 0]$ and $P[\gamma_2 > 0]$ are both very large, and therefore that pine trees with excess CO$_2$ mature earlier and produce more cones than pine trees grown under normal conditions.

Figure 6.13 was produced by the following snippet.

    par(mfrow=c(1,2))
    # mcmc.out has no column names; columns 3 and 6 hold b2 and g2
    plot(density(mcmc.out[10000:100000, 3]),
         xlab=expression(beta[2]),
         ylab=expression(p(beta[2])), main="")
    plot(density(mcmc.out[10000:100000, 6]),
         xlab=expression(gamma[2]),
         ylab=expression(p(gamma[2])), main="")
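Besides plotting densities, posterior probabilities such as $P[\beta_2 > 0]$ can be estimated directly as the fraction of retained draws that are positive. The sketch below shows the calculation on stand-in draws: the Normal draws and their means and standard deviations are invented so the code can run on its own, and they are not the results of Example 6.2. With the real output, the same lines would be applied to the relevant columns of the stored chain after dropping the early iterations.

```r
set.seed(6)
# stand-in draws; in the example these would come from the stored chain
b2 <- rnorm(5000, mean = 1.0, sd = 0.4)   # made-up values playing the role of beta_2
g2 <- rnorm(5000, mean = 0.5, sd = 0.2)   # made-up values playing the role of gamma_2

# estimated posterior probabilities that each coefficient is positive
p.pos <- c(b2 = mean(b2 > 0), g2 = mean(g2 > 0))
p.pos

# a central 95% interval and the median for one coefficient
quantile(b2, c(0.025, 0.5, 0.975))
```

Draws from an MCMC chain are correlated, so such estimates are less precise than the same number of independent draws would give, but the calculation itself is unchanged.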


Figure 6.11: Trace plots of MCMC output from the pine cone code with proposal function g.group and 100,000 iterations. The plots show every 10'th iteration.


Figure 6.12: Pairs plots of MCMC output from the pine cones example with proposal g.group.


Figure 6.13: Posterior density of $\beta_2$ and $\gamma_2$ from Example 6.2.

6.3 Exercises

1. This exercise asks you to enhance the code for the Be(5, 2) example on page 352.

   (a) How many samples is enough? Instead of 10,000, try different numbers. How few samples can you get away with and still have an adequate approximation to the Be(5, 2) distribution? You must decide what "adequate" means; you can use either a firm or fuzzy definition. Illustrate your results with figures similar to 6.4.

   (b) Try an independence sampler in the Be(5, 2) example on page 352. Replace the proposal kernel with $\theta^* \sim U(0, 1)$. Run the sampler, make a figure similar to Figure 6.4 and describe the result.

   (c) Does the proposal distribution matter? Instead of proposing with a radius of 0.1, try different numbers. How much does the proposal radius matter? Does the proposal radius change your answer to part 1a? Illustrate your results with figures similar to 6.4.

   (d) Try a non-symmetric proposal. For example, you might try a proposal distribution of the form Be($a$, 1), or a distribution that puts 2/3 of its mass on $(x_{i-1} - .1, x_{i-1})$ and 1/3 of its mass on $(x_{i-1}, x_{i-1} + .1)$. Illustrate your


   results with figures similar to 6.4.

   (e) What would happen if your proposal distribution were Be(5, 2)? How would the algorithm simplify?

2. (a) Some researchers are interested in $\theta$, the proportion of students who ever cheat on college exams. They randomly sample 100 students and ask "Have you ever cheated on a college exam?" Naturally, some students lie. Let $\phi_1$ be the proportion of non-cheaters who lie and $\phi_2$ be the proportion of cheaters who lie. Let $X$ be the number of students who answer "yes" and suppose $X = 40$.

      i. Create a prior distribution for $\theta$, $\phi_1$, and $\phi_2$. Use your knowledge guided by experience. Write a formula for your prior and plot the marginal prior density of each parameter.
      ii. Write a formula for the likelihood function $\ell(\theta, \phi_1, \phi_2)$.
      iii. Find the m.l.e.
      iv. Write a formula for the joint posterior density $p(\theta, \phi_1, \phi_2 \mid X = 40)$.
      v. Write a formula for the marginal posterior density $p(\theta \mid X = 40)$.
      vi. Write an MCMC sampler to sample from the joint posterior.
      vii. Use the sampler to find $p(\theta \mid X = 40)$. Summarize your results. Include information on how you assessed mixing and on what you learned about $p(\theta \mid X = 40)$.
      viii. Assess the sensitivity of your posterior, $p(\theta \mid X = 40)$, to your prior for $\phi_1$ and $\phi_2$.

   (b) Randomized response. This part of the exercise uses ideas from Exercises 35 in Chapter 1 and 6 in Chapter 2. As explained there, researchers will sometimes instruct subjects as follows.

          Toss a coin, but don't show it to me. If it lands Heads, answer question (a). If it lands tails, answer question (b). Just answer 'yes' or 'no'. Do not tell me which question you are answering.
          (a) Does your telephone number end in an even digit?
          (b) Have you ever cheated on an exam in college?

       The idea of the randomization is, of course, to reduce the incentive to lie. Nonetheless, students may still lie.

      i. If about 40 students answered 'yes' in part a, about how many do you think will answer 'yes' under the conditions of part b?


      ii. Repeat part a under the conditions of part b and with your best guess about what $X$ will be under these conditions.
      iii. Assess whether researchers who are interested in $\theta$ are better off using the conditions of part a or part b.

3. Figures 6.11 and 6.12 suggest that the MCMC sampler has found one mode of the posterior density. Might there be others? Use the lik function and R's optim function to find out. Either design or randomly generate some starting values (you must decide on good choices for either the design or the randomization) and use optim to find a mode of the likelihood function. Summarize and report your results.

4. Example 6.2 shows that $\beta_2$ and $\gamma_2$ are very likely positive, and therefore that pine trees with extra CO$_2$ mature earlier and produce more cones. But how much earlier and how many more?

   (a) Find the posterior means $E[\beta_2 \mid y_1, \dots, y_n]$ and $E[\gamma_2 \mid y_1, \dots, y_n]$ approximately, from the Figures in the text.

   (b) Suppose there are three trees in the control plots that have probabilities 0.1, 0.5, and 0.9 of being sexually mature. Plugging in $E[\beta_2 \mid y_1, \dots, y_n]$ from the previous question, estimate their probabilities of being mature if they had grown with excess CO$_2$.

   (c) Is the plug-in estimate from the previous question correct? I.e., does it correctly calculate the probability that those trees would be sexually mature? Explain why or why not. If it's not correct, explain how to calculate the probabilities correctly.

5. In the context of Example 6.2 we might want to investigate whether the coefficient of dbh should be the same for control trees and for treated trees.

   (a) Write down a model enhancing that on page 346 to allow for the possibility of different coefficients for different treatments.
   (b) What parts of the R code have to be changed?
   (c) Write the new code.
   (d) Run it.
   (e) Summarize and report results. Report any difficulties with modifying and running the code. Say how many iterations you ran and how you checked mixing. Also report conclusions: does it look like different treatments need different coefficients? How can you tell?


CHAPTER 7

MORE MODELS

This chapter takes up a wide variety of statistical models. It is beyond the scope of this book to give a full treatment of any one of them. But we hope to introduce each model enough so the reader can see in what situations it might be useful, what its primary characteristics are, and how a simple analysis might be carried out in R. A more thorough treatment of many of these models can be found in Venables and Ripley [2002].

7.1 Hierarchical Models

It is often useful to think of populations as having subpopulations, and those as having subsubpopulations, and so on. One example comes from [cite Worsley et al] who describe fMRI (functional magnetic resonance imaging) experiments. A subject is placed in an MRI machine and subjected to several stimuli while the machine measures the amount of oxygen flowing to various parts of the brain. Different stimuli affect different parts of the brain, allowing scientists to build up a picture of how the brain works. Let the generic parameter $\theta$ be the change in blood flow to a particular region of the brain under a particular stimulus. $\theta$ is called an effect. As [citation] explain, $\theta$ may vary from subject to subject, from session to session even for the same patient, and from run to run even within the same session. To describe the situation fully we need three subscripts, so let $\theta_{ijk}$ be the effect in subject $i$, session $j$, run $k$. For a single subject $i$ and session $j$ there will be an overall average effect; call it $\mu_{ij}$. The set $\{\theta_{ijk}\}_k$ will fall around $\mu_{ij}$ with a bit of variation for each run $k$. Assuming Normal distributions, we would write

\[ \{\theta_{ijk}\}_k \mid \mu_{ij}, \sigma_k \sim \text{i.i.d. } N(\mu_{ij}, \sigma_k) \]


Likewise, for a single subject $i$ there will be an overall average effect; call it $\mu_i$. The set $\{\mu_{ij}\}_j$ will fall around $\mu_i$ with a bit of variation for each session $j$. Further, each $\mu_i$ is associated with a different subject so they are like draws from a population with a mean and standard deviation, say $\mu$ and $\sigma_i$. Thus the whole model can be written

\[
\begin{aligned}
\{\theta_{ijk}\}_k \mid \mu_{ij}, \sigma_k &\sim \text{i.i.d. } N(\mu_{ij}, \sigma_k) \\
\{\mu_{ij}\}_j \mid \mu_i, \sigma_j &\sim \text{i.i.d. } N(\mu_i, \sigma_j) \\
\{\mu_i\}_i \mid \mu, \sigma_i &\sim \text{i.i.d. } N(\mu, \sigma_i)
\end{aligned}
\]

Figure 7.1 is a graphical representation of this model.

[pretty picture here]

Figure 7.1: Graphical representation of hierarchical model for fMRI

More examples: meta analysis, Jackie Mohan's germination records, Chantal's Arabidopsis, CO2 uptake from R and Pinheiro and Bates, FACE growth rates by tree|ring|treatment.

Is a sample from one population or several? Mixtures of Normals. Extra variation in Binomials, Poisson, etc. Hierarchical and random effects models. Discrete populations: medical trials, different species, locations, subjects, treatments.

7.2 Time Series and Markov Chains

Figure 7.2 shows some data sets that come with R. The following descriptions are taken from the R help pages.

Beaver  The data are a small part of a study of the long-term temperature dynamics of beaver Castor canadensis in north-central Wisconsin. Body temperature was measured by telemetry every 10 minutes for four females, but data from one period of less than a day is shown here.

Mauna Loa  Monthly atmospheric concentrations of CO$_2$ are expressed in parts per million (ppm) and reported in the preliminary 1997 SIO manometric mole fraction scale.

DAX  The data are the daily closing prices of Germany's DAX stock index. The data are sampled in business time; i.e., weekends and holidays are omitted.


UK Lung Disease  The data are monthly deaths from bronchitis, emphysema and asthma in the UK, 1974–1979.

Canadian Lynx  The data are annual numbers of lynx trappings for 1821–1934 in Canada.

Presidents  The data are (approximately) quarterly approval rating for the President of the United States from the first quarter of 1945 to the last quarter of 1974.

UK drivers  The data are monthly totals of car drivers in Great Britain killed or seriously injured Jan 1969 to Dec 1984. Compulsory wearing of seat belts was introduced on 31 Jan 1983.

Sun Spots  The data are monthly numbers of sunspots. They come from the World Data Center-C1 For Sunspot Index, Royal Observatory of Belgium, Av. Circulaire, 3, B-1180 BRUSSELS http://www.oma.be/KSB-ORB/SIDC/sidc_txt.html

What these data sets have in common is that they were all collected sequentially in time. Such data are known as time series data. Because each data point is related to the ones before and the ones after, they usually cannot be treated as independent random variables. Methods for analyzing data of this type are called time series methods. More formally, a time series is a sequence $Y_1, \dots, Y_T$ of random variables indexed by time. The generic element of the series is usually denoted $Y_t$.

Figure 7.2 was produced by the following snippet.

    par(mfrow=c(4,2))
    plot.ts(beaver1$temp, main="Beaver", xlab="Time",
            ylab="Temperature")
    plot.ts(co2, main="Mauna Loa", ylab="CO2 (ppm)")
    plot.ts(EuStockMarkets[,1], main="DAX",
            ylab="Closing Price")
    plot.ts(ldeaths, main="UK Lung Disease",
            ylab="monthly deaths")
    plot.ts(lynx, main="Canadian Lynx", ylab="trappings")
    plot.ts(presidents, main="Presidents", ylab="approval")
    plot.ts(Seatbelts[,"DriversKilled"], main="UK drivers",
            ylab="deaths")


Figure 7.2: Beaver: Body temperature of a beaver, recorded every 10 minutes; Mauna Loa: Atmospheric concentration of CO$_2$; DAX: Daily closing prices of the DAX stock index in Germany; UK Lung Disease: monthly deaths from bronchitis, emphysema and asthma; Canadian Lynx: annual number of trappings; Presidents: quarterly approval ratings; UK drivers: deaths of car drivers; Sun Spots: monthly sunspot numbers.


    plot.ts(sunspot.month, main="Sun Spots",
            ylab="number of sunspots")

plot.ts is the command for plotting time series.

The data sets in Figure 7.2 exhibit a feature common to many time series: if one data point is large, the next tends to be large, and if one data point is small, the next tends to be small; i.e., $Y_t$ and $Y_{t+1}$ are dependent. The dependence can be seen in Figure 7.3 which plots $Y_{t+1}$ vs. $Y_t$ for the Beaver and Presidents data sets. The upward trend in each panel shows the dependence. Time series analysts typically use the term autocorrelation (the prefix auto refers to the fact that the time series is correlated with itself) even though they mean dependence. R has the built-in function acf for computing autocorrelations. The following snippet shows how it works.

    > acf(beaver1$temp, plot=F, lag.max=5)

    Autocorrelations of series 'beaver1$temp', by lag

        0     1     2     3     4     5
    1.000 0.826 0.686 0.580 0.458 0.342

The six numbers in the bottom line are $\mathrm{Cor}(Y_t, Y_t), \mathrm{Cor}(Y_t, Y_{t+1}), \dots, \mathrm{Cor}(Y_t, Y_{t+5})$ and are referred to as autocorrelations of lag 0, lag 1, ..., lag 5. Those autocorrelations can, as usual, be visualized with plots as in Figure 7.4.

Figure 7.3 was produced by the following snippet.

    dim(beaver1)
    plot(beaver1$temp[-114], beaver1$temp[-1], main="Beaver",
         xlab=expression(y[t]), ylab=expression(y[t+1]))
    length(presidents)
    plot(presidents[-120], presidents[-1], main="Presidents",
         xlab=expression(y[t]), ylab=expression(y[t+1]))
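The autocorrelations reported by acf above can be reproduced by hand, which makes clear what is being computed. The sketch below uses the same built-in beaver1 data; the names `dev`, `r.hand`, and `r.acf` are our own. Note that acf centers every term at the overall mean and divides by the overall sum of squares, so its values differ slightly from what cor would give on the two shifted series.

```r
temp <- beaver1$temp
n    <- length(temp)
dev  <- temp - mean(temp)      # deviations from the overall mean

# lag-k autocorrelation: sum of products of deviations k steps apart,
# divided by the overall sum of squared deviations
r.hand <- sapply(0:5, function(k)
  sum(dev[1:(n - k)] * dev[(1 + k):n]) / sum(dev^2))

r.acf <- as.vector(acf(temp, plot = FALSE, lag.max = 5)$acf)
round(rbind(r.hand, r.acf), 3)
```

The two rows agree, and the lag 1 value matches the 0.826 shown in the acf output.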


Figure 7.3: $Y_{t+1}$ plotted against $Y_t$ for the Beaver and Presidents data sets

Figure 7.4 was produced by the following snippet.

    par(mfrow=c(3,2))
    temp <- beaver1$temp
    n <- length(temp)
    for (k in 0:5) {
      x <- temp[1:(n-k)]
      y <- temp[(1+k):n]
      plot(x, y, xlab=expression(Y[t]),
           ylab=expression(Y[t+k]), main=paste("lag =", k))
    }

Because time series data cannot usually be treated as independent, we need special methods to deal with them. It is beyond the scope of this book to present the major theoretical developments of time series methods. As Figure 7.2 shows, there can be a wide variety of structure in time series data. In particular, the Beaver and Presidents data sets have no structure readily apparent to the eye; DAX has seemingly minor fluctuations imposed on a general increasing trend; UK


Figure 7.4: $Y_{t+k}$ plotted against $Y_t$ for the Beaver data set and lags $k = 0, \ldots, 5$


Lung Disease and UK drivers have an annual cycle; Mauna Loa has an annual cycle imposed on a general increasing trend; and Canadian Lynx and Sun Spots are cyclic, but for no obvious reason and with no obvious length of the cycle. In the remainder of this section we will show, by analyzing some of the data sets in Figure 7.2, some of the possibilities.

Beaver

Our goal is to develop a more complete picture of the probabilistic structure of the $\{Y_t\}$'s. To that end, consider the following question. If we're trying to predict $Y_{t+1}$, and if we already know $Y_t$, does it help us also to know $Y_{t-1}$? I.e., are $Y_{t-1}$ and $Y_{t+1}$ conditionally independent given $Y_t$? That question can be answered visually with a coplot (Figures 2.15 and 2.16). Figure 7.5 shows the coplot for the Beaver data.

Figure 7.5 was produced by the following snippet.

    temp <- beaver1$temp
    n <- length(temp)
    coplot(temp[3:n] ~ temp[1:(n-2)] | temp[2:(n-1)],
           xlab=c(expression(Y[t-1]), expression(Y[t])),
           ylab=expression(Y[t+1]))

The figure is ambiguous. In the first, second, and sixth panels, $Y_{t+1}$ and $Y_{t-1}$ seem to be linearly related given $Y_t$, while in the third, fourth, and fifth panels, $Y_{t+1}$ and $Y_{t-1}$ seem to be independent given $Y_t$. We can examine the question numerically with the partial autocorrelation, the conditional correlation of $Y_{t+1}$ and $Y_{t-1}$ given $Y_t$. The following snippet shows how to compute partial autocorrelations in R using the function pacf.

    > pacf(temp, lag.max=5, plot=F)

    Partial autocorrelations of series 'temp', by lag

         1      2      3      4      5
     0.826  0.014  0.031 -0.101 -0.063

The numbers in the bottom row are $\mathrm{Cor}(Y_t, Y_{t+k} \mid Y_{t+1}, \ldots, Y_{t+k-1})$. Except for the first, they're small. Figure 7.5 and the partial autocorrelations suggest that a model in which $Y_{t+1} \perp Y_{t-1} \mid Y_t$ would fit the data well. And the first panel in Figure 7.4


Figure 7.5: coplot of $Y_{t+1}$ as a function of $Y_{t-1}$ given $Y_t$ for the Beaver data set


suggests that a model of the form $Y_{t+1} = \beta_0 + \beta_1 Y_t + \epsilon_{t+1}$ might fit well. Such a model is called an autoregression. R has a function ar for fitting them. Here's how it works with the Beaver data.

    > fit <- ar(beaver1$temp, order.max=1)
    > fit   # see what we've got

    Call:
    ar(x = beaver1$temp, order.max = 1)

    Coefficients:
         1
    0.8258

    Order selected 1  sigma^2 estimated as  0.01201

The 0.8258 means that the fitted model is $Y_{t+1} = \beta_0 + 0.8258\,Y_t + \epsilon_{t+1}$. The $\epsilon_t$'s have an estimated variance of 0.012. fit$x.mean shows that the estimated mean of the series is 36.86. Because ar fits the autoregression to the centered series, $Y_{t+1} - 36.86 = 0.8258\,(Y_t - 36.86) + \epsilon_{t+1}$, the implied intercept is $\hat\beta_0 = 36.86 \times (1 - 0.8258) \approx 6.42$. Finally, qqnorm(fit$resid) (Try it.) shows a nearly linear plot, except for one point, indicating that $Y_{t+1} \sim \mathrm{N}(6.42 + 0.8258\,Y_t, \sqrt{0.012})$ is a reasonably good model, except for one outlier.

Mauna Loa

The Mauna Loa data look like an annual cycle superimposed on a steadily increasing long term trend. Our goal is to estimate both components and decompose the data as

$$Y_t = \text{long term trend} + \text{annual cycle} + \text{unexplained variation}.$$

Our strategy, because it seems easiest, is to estimate the long term trend first, then use deviations from the long term trend to estimate the annual cycle. A sensible estimate of the long term trend at time $t$ is the average of a year's CO$_2$ readings, for a year centered at $t$. Thus, let

$$\hat g(t) = \frac{.5\,y_{t-6} + y_{t-5} + \cdots + y_{t+5} + .5\,y_{t+6}}{12} \qquad (7.1)$$

where $g(t)$ represents the long term trend at time $t$. R has the built-in command filter to compute $\hat g$. The result is shown in Figure 7.6 (a), which also shows how to use filter. Deviations from $\hat g$ are co2 - g.hat. See Figure 7.6 (b). The deviations can be grouped by month, then averaged. The average of the January deviations,


for example, is a good estimate of how much the January CO$_2$ deviates from the long term trend, and likewise for other months. See Figure 7.6 (c). Finally, Figure 7.6 (d) shows the data, $\hat g$, and the fitted values $\hat g$ + monthly effects. The fit is good: the fitted values differ very little from the data.

Figure 7.6 was produced by the following snippet.

    filt <- c(.5, rep(1,11), .5) / 12
    g.hat <- filter(co2, filt)
    par(mfrow=c(2,2))
    plot.ts(co2, main="(a)")
    lines(g.hat)
    resids <- co2 - g.hat
    plot.ts(resids, main="(b)")
    resids <- matrix(resids, nrow=12)
    cycle <- apply(resids, 1, mean, na.rm=T)
    plot(cycle, type="b", main="(c)")
    plot.ts(co2, type="p", pch=".", main="(d)")
    lines(g.hat)
    lines(g.hat + cycle)

DAX

$Y_t$ is the closing price of the German stock exchange DAX on day $t$. Investors often care about the rate of return $Y^*_t = Y_{t+1}/Y_t$, so we'll have to consider whether to analyze the $Y_t$'s directly, or convert them to $Y^*_t$'s first. Figure 7.7 is for the DAX prices directly. Panel (a) shows the $Y_t$'s. It seems to show minor fluctuations around a steadily increasing trend. Panel (b) shows the time series of $Y_t - Y_{t-1}$. It seems to show a series of fluctuations approximately centered around 0, with no apparent pattern, and with larger fluctuations occurring later in the series. Panel (c) shows $Y_t$ versus $Y_{t-1}$. It shows a strong linear relationship between $Y_t$ and $Y_{t-1}$. Two lines are drawn on the plot: the lines $Y_t = \beta_0 + \beta_1 Y_{t-1}$ for $(\beta_0, \beta_1) = (0, 1)$ and for $(\beta_0, \beta_1)$ set equal to the ordinary regression coefficients found by lm. The two lines are indistinguishable, suggesting that $Y_t \approx Y_{t-1}$ is a good model for the data. Panel (d) is a Q-Q plot of $Y_t - Y_{t-1}$. It is not approximately linear, suggesting that $Y_t \sim \mathrm{N}(Y_{t-1}, \sigma)$ is not a good model for the data.
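The claim that the regression line is indistinguishable from the line with intercept 0 and slope 1 can be checked numerically. The following is a minimal sketch, assuming that DAX is the "DAX" column of R's built-in EuStockMarkets data; the names fit and b are illustrative.

```r
# Assumption: the DAX series is the "DAX" column of EuStockMarkets.
DAX <- as.numeric(EuStockMarkets[, "DAX"])
n <- length(DAX)

# Regress today's price on yesterday's price.
fit <- lm(DAX[-1] ~ DAX[-n])
b <- unname(coef(fit))   # b[1] is the intercept, b[2] is the slope

print(b)
print(cor(DAX[-1], DAX[-n]))
```

A slope close to 1, together with an intercept that is small relative to the price level, is what makes the two lines in panel (c) hard to tell apart.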


Figure 7.6: (a): CO$_2$ and $\hat g$; (b): residuals; (c): residuals averaged by month; (d): data, $\hat g$, and fitted values
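The trend estimate $\hat g$ plotted in Figure 7.6 is the centered moving average of Equation 7.1. As a sketch of what filter computes, the value at $t = 7$ can be reproduced by hand from the first thirteen CO$_2$ readings. The call is written as stats::filter so that it does not depend on which other packages happen to be loaded; the name by.hand is illustrative.

```r
# Weights of Equation 7.1: half weight on the two end months.
filt <- c(.5, rep(1, 11), .5) / 12
g.hat <- stats::filter(co2, filt)

# Equation 7.1 at t = 7 uses y[1], ..., y[13].
by.hand <- (.5 * co2[1] + sum(co2[2:12]) + .5 * co2[13]) / 12

print(c(filter = as.numeric(g.hat[7]), by.hand = by.hand))
```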


Figure 7.7: DAX closing prices. (a): the time series of $Y_t$'s; (b): $Y_t - Y_{t-1}$; (c): $Y_t$ versus $Y_{t-1}$; (d): QQ plot of $Y_t - Y_{t-1}$.


Figure 7.7 was produced by the following snippet.

    # assumes DAX holds the DAX closing prices and n <- length(DAX)
    par(mfrow=c(2,2))
    plot.ts(DAX, main="(a)")
    plot.ts(diff(DAX), ylab=expression(DAX[t] - DAX[t-1]),
            main="(b)")
    plot(DAX[-n], DAX[-1], xlab=expression(DAX[t-1]),
         ylab=expression(DAX[t]), main="(c)")
    abline(0, 1)
    abline(lm(DAX[-1] ~ DAX[-n])$coef, lty=2)
    qqnorm(diff(DAX), main="(d)")

The R command diff is for taking differences, typically of time series. diff(y) yields y[2]-y[1], y[3]-y[2], ... which could also be accomplished easily enough without using diff: y[-1] - y[-n]. But additional arguments, as in diff(y, lag, differences), make it much more useful. For example, diff(y, lag=2) yields y[3]-y[1], y[4]-y[2], ... while diff(y, differences=2) is the same as diff(diff(y)). The latter is a construct very useful in time series analysis.

Figure 7.8 is for the $Y^*_t$'s. Panel (a) shows the time series. It shows a seemingly patternless set of data centered around 1. Panel (b) shows the time series of $Y^*_t - Y^*_{t-1}$, a seemingly patternless set of data centered at 0. Panel (c) shows $Y^*_t$ versus $Y^*_{t-1}$. It shows no apparent relationship between $Y^*_t$ and $Y^*_{t-1}$, suggesting that $Y^*_t \perp Y^*_{t-1}$ is a good model for the data. Panel (d) is a Q-Q plot of $Y^*_t$. It is approximately linear, suggesting that $Y^*_t \sim \mathrm{N}(\mu, \sigma)$ is a good model for the data, with a few outliers on both the high and low ends. The mean and SD of the $Y^*$'s are about 1.000705 and 0.01028; so $Y^* \sim \mathrm{N}(1.0007, 0.01)$ might be a good model.

Figure 7.8 was produced by the following snippet.

    # assumes rate holds the rates of return and n2 <- length(rate)
    par(mfrow=c(2,2))
    plot.ts(rate, main="(a)")
    plot.ts(diff(rate), ylab=expression(rate[t] - rate[t-1]),
            main="(b)")
    plot(rate[-n2], rate[-1], xlab=expression(rate[t-1]),
         ylab=expression(rate[t]), main="(c)")
    qqnorm(rate, main="(d)")
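The behavior of diff described above is easy to confirm on a short vector, and the rates of return can be built from the prices in one line. This is only a sketch: the vector y is made up for illustration, and the construction of rate assumes that DAX is the "DAX" column of R's built-in EuStockMarkets data and that rate[t] = DAX[t+1] / DAX[t], following the definition of $Y^*_t$ in the text.

```r
# A small made-up vector for checking diff().
y <- c(1, 4, 9, 16, 25)
n <- length(y)

d1 <- diff(y)                        # y[2]-y[1], y[3]-y[2], ...
d1.by.hand <- y[-1] - y[-n]          # the same thing without diff()
d.lag2 <- diff(y, lag = 2)           # y[3]-y[1], y[4]-y[2], ...
d.twice <- diff(y, differences = 2)  # same as diff(diff(y))

print(d1); print(d.lag2); print(d.twice)

# Rates of return (assumption: DAX taken from EuStockMarkets).
DAX <- as.numeric(EuStockMarkets[, "DAX"])
rate <- DAX[-1] / DAX[-length(DAX)]
print(c(mean = mean(rate), sd = sd(rate)))
```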


Figure 7.8: DAX returns. (a): the time series of $Y^*_t$'s; (b): $Y^*_t - Y^*_{t-1}$; (c): $Y^*_t$ versus $Y^*_{t-1}$; (d): QQ plot of $Y^*_t$.


We now have two possible models for the DAX data: $Y_t \approx Y_{t-1}$ with a still to be determined distribution and $Y^* \sim \mathrm{N}(1.0007, 0.01)$ with the $Y^*_t$'s mutually independent. Both seem plausible on statistical grounds. (But see Exercise 5 for further development.) It is not necessary to choose one or the other. Having several ways of describing a data set is useful. Each model gives us another way to view the data. Economists and investors might prefer one or the other at different times or for different purposes. There might even be other useful models that we haven't yet considered. Those would be beyond the scope of this book, but could be covered in texts on time series, financial mathematics, econometrics, or similar topics.

population matrix models

Example 7.1 (FACE)
Richter's throughfall data

Example 7.2 (hydrology)
Jagdish's data. Is this example too complicated?

7.3 Contingency Tables

loglinear models? Simpson's paradox? Census tables as examples?

7.4 Survival analysis

In many studies, the random variable is the time at which an event occurs. For example,

medicine: The time until a patient dies.
neurobiology: The time until a neuron fires.
quality control: The time until a computer crashes.
higher education: The time until an Associate Professor is promoted to Full Professor.

Such data are called survival data. For the $i$'th person, neuron, computer, etc., there is a random variable

$$y_i = \text{time of event on } i\text{'th unit}.$$


We usually call $y_i$ the lifetime, even though the event is not necessarily death. It is often the case with survival data that some measurements are censored. For example, if we study a university's records to see how long it takes to get promoted from Associate to Full Professor, we will find some Associate Professors leave the university (either through retirement or by taking another job) before they get promoted, while others are still Associate Professors at the time of our study. For these people we don't know their time of promotion. If either (a) person $i$ left the university after five years, or (b) person $i$ became Associate Professor five years prior to our study, then we don't know $y_i$ exactly. All we know is $y_i > 5$. This form of censoring is called right censoring. In some data sets there may also be left censoring or interval censoring. Survival analysis typically requires specialized statistical techniques. R has a package of functions for this purpose; the name of the package is survival. The survival package is automatically distributed with R. To load it into your R session, type library(survival). The package comes with functions for survival analysis and also with some example data sets. Our next example uses one of those data sets.

Example 7.3 (Bladder Tumors)
This example comes from a study of bladder tumors, originally published in Byar [1980] and later reanalyzed in Wei et al. [1989]. Patients had bladder tumors. The tumors were removed and the patients were randomly assigned to one of three treatment groups (placebo, thiotepa, pyridoxine). Then the patients were followed through time to see whether and when bladder tumors would recur. R's survival package has the data for the first two treatment groups, placebo and thiotepa. Type bladder to see it. (Remember to load the survival package first.) The last several lines look like this.

        id rx number size stop event enum
    341 83  2      3    4   54     0    1
    342 83  2      3    4   54     0    2
    343 83  2      3    4   54     0    3
    344 83  2      3    4   54     0    4
    345 84  2      2    1   38     1    1
    346 84  2      2    1   54     0    2
    347 84  2      2    1   54     0    3
    348 84  2      2    1   54     0    4
    349 85  2      1    3   59     0    1
    350 85  2      1    3   59     0    2
    351 85  2      1    3   59     0    3
    352 85  2      1    3   59     0    4


id is the patient's id number. Note that each patient has four lines of data. That's to record up to four recurrences of tumor.
rx is the treatment: 1 for placebo; 2 for thiotepa.
number is the number of tumors the patient had at the initial exam, when the patient joined the study.
size is the size (cm) of the largest initial tumor.
stop is the time (months) of the observation.
event is 1 if there's a tumor; 0 if not.
enum is the line number, 1, 2, 3, or 4, for each patient.

For example, patient 83 was followed for 54 months and had no tumor recurrences; patient 85 was followed for 59 months and also had no recurrences. But patient 84, who was also followed for 54 months, had a tumor recurrence at month 38 and no further recurrences after that. Our analysis will look at the time until the first recurrence, so we want bladder[bladder$enum==1,], the last several lines of which are

        id rx number size stop event enum
    329 80  2      3    3   49     0    1
    333 81  2      1    1   50     0    1
    337 82  2      4    1    4     1    1
    341 83  2      3    4   54     0    1
    345 84  2      2    1   38     1    1
    349 85  2      1    3   59     0    1

Patients 80, 81, 83, and 85 had no tumors for as long as they were followed; their data is right-censored. The data for patients 82 and 84 is not censored; it is observed exactly.

Figure 7.9 is a plot of the data. The solid line is for thiotepa; the dashed line for placebo. The abscissa is in months. The ordinate shows the fraction of patients who have survived without a recurrence of bladder tumors. The plot shows, for example, that at 30 months, the survival rate without recurrence is about 50% for thiotepa patients compared to a little under 40% for placebo patients. The circles on the plot show censoring. I.e., the four circles on the solid curve between 30 and 40 months represent four placebo patients whose data was right-censored. There is a circle at every censoring time that is not also the time of a recurrence for a different patient.
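The first-recurrence subset described above can be pulled out and summarized in a few lines. This is a sketch rather than part of the original analysis; the names first and tab are illustrative, and it assumes the bladder data frame is available once the survival package is loaded.

```r
library(survival)

# One row per patient: the line that records time to first recurrence.
first <- bladder[bladder$enum == 1, ]

# Cross-tabulate treatment (1 = placebo, 2 = thiotepa) against
# event (1 = recurrence observed, 0 = right-censored).
tab <- table(rx = first$rx, event = first$event)

print(nrow(first))
print(tab)
print(tail(first))
```

The column event == 0 of the table counts the right-censored patients in each treatment group.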


Figure 7.9: Survival curve for bladder cancer. Solid line for thiotepa; dashed line for placebo.


Figure 7.9 was produced with the snippet

    event.first <- bladder[,"enum"] == 1
    blad.surv <- Surv(bladder[event.first,"stop"],
                      bladder[event.first,"event"])
    blad.fit <- survfit(blad.surv ~ bladder[event.first,"rx"])
    plot(blad.fit, conf.int=FALSE, mark=1, xlab="months",
         ylab="fraction without recurrence", lty=1:2)

Surv is R's function for creating a survival object. You can type print(blad.surv) and summary(blad.surv) to learn more about survival objects.
survfit computes an estimate of a survival curve.

In survival analysis we think of $y_1, \ldots, y_n$ as a sample from a distribution $F$, with density $f$, of lifetimes. Statisticians often work with the survivor function $S(t) = 1 - F(t) = \mathrm{P}[y_i > t]$, the probability that a unit survives beyond time $t$. The lines in Figure 7.9 are the so-called Kaplan-Meier estimates of $S$ for patients in the thiotepa and placebo groups, which arise from the following argument. Partition $\mathbb{R}^+$ into intervals $(0, t_1]$, $(t_1, t_2]$, ..., and let $p_i = \mathrm{P}[y > t_i \mid y > t_{i-1}]$. Then for each $i$, $S(t_i) = \prod_{j=1}^{i} p_j$. The $p_i$'s can be estimated from data as $\hat p_i = (r(t_i) - d_i)/r(t_i)$ where $r(t_i)$ is the number of people at risk (in the study but not yet dead) at time $t_{i-1}$ and $d_i$ is the number of deaths in the interval $(t_{i-1}, t_i]$. Thus, $\hat S(t_i) = \prod_{j=1}^{i} (r(t_j) - d_j)/r(t_j)$. As the partition becomes finer, most terms in the product are equal to one; only those intervals with a death contribute a term that is not one. The limit yields the Kaplan-Meier estimate

$$\hat S(t) = \prod_{i:\, y_i \le t} \frac{r(y_i) - d_i}{r(y_i)},$$

where the product runs over the observed event times. The data carry no information about $S(t)$ for $t > \max\{y_i\}$.

Survival data is often modelled in terms of the hazard function

$$h(t) = \lim_{h \to 0} \frac{\mathrm{P}[y \in [t, t+h) \mid y \ge t]}{h} = \lim_{h \to 0} \frac{1}{h}\,\frac{\mathrm{P}[y \in [t, t+h)]}{\mathrm{P}[y \ge t]} = \frac{f(t)}{S(t)}. \qquad (7.2)$$
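The Kaplan-Meier product described above can be computed directly and compared with what survfit returns. The following is a minimal sketch for the placebo group only; the names first, placebo, times, km, and s are illustrative, and the by-hand version counts a patient as at risk at time $t$ when the patient's stop time is at least $t$.

```r
library(survival)

# First-recurrence rows for the placebo group (rx == 1).
first <- bladder[bladder$enum == 1, ]
placebo <- first[first$rx == 1, ]

# Distinct times at which a recurrence was observed.
times <- sort(unique(placebo$stop[placebo$event == 1]))

# One factor (r - d) / r per event time, then a running product.
km <- cumprod(sapply(times, function(t) {
  at.risk <- sum(placebo$stop >= t)
  deaths <- sum(placebo$stop == t & placebo$event == 1)
  (at.risk - deaths) / at.risk
}))

# The same curve from survfit, evaluated at the event times.
fit <- survfit(Surv(stop, event) ~ 1, data = placebo)
s <- summary(fit, times = times)$surv

print(cbind(time = times, by.hand = km, survfit = s))
```

Censored patients never contribute a factor of their own; they only leave the risk set, which is why the curve stays flat at censoring times.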


The interpretation of $h(t)$ is the fraction, among people who have survived to time $t$, of those who will die soon thereafter. There are several parametric families of distributions for lifetimes in use for survival analysis. The most basic is the exponential, $f(y) = (1/\lambda)\exp(-y/\lambda)$, which has hazard function $h(y) = 1/\lambda$, a constant. A constant hazard function says, for example, that young people are just as likely to die as old people, or that new air conditioners are just as likely to fail as old air conditioners. For many applications that assumption is unreasonable, so statisticians may work with other parametric families for lifetimes, especially the Weibull, which has $h(y) = \lambda\alpha(\lambda y)^{\alpha - 1}$, an increasing function of $y$ if $\alpha > 1$. We will not dwell further on parametric models; the interested reader should refer to a more specialized source.

However, the goal of survival analysis is not usually to estimate $S$ and $h$, but to compare the survivor and hazard functions for two groups such as treatment and placebo, or to see how the survivor and hazard functions vary as functions of some covariates. Therefore it is not usually necessary to estimate $S$ and $h$ well, as long as we can estimate how $S$ and $h$ differ between groups, or as a function of the covariates. For this purpose it has become common to adopt a proportional hazards model:

$$h(y) = h_0(y)\exp(\beta' x) \qquad (7.3)$$

where $h_0$ is the baseline hazard function that is adjusted according to $x$, a vector of covariates, and $\beta$, a vector of coefficients. Equation 7.3 is known as the Cox proportional hazards model. The goal is usually to estimate $\beta$. R's survival package has a function for fitting Equation 7.3 to data.

Example 7.4 (Bladder Tumors, cont.)
This continues Example 7.3. Here we adopt the Cox proportional hazards model and see how well we can estimate the effect of the treatment thiotepa compared to placebo in preventing recurrence of bladder tumors. We will also examine the effects of other potential covariates.
We'd like to fit the Cox proportional hazards model

$$h(y) = h_0(y)\exp(\beta_{\mathrm{trt}}\,\mathrm{trt})$$

where trt is an indicator variable that is 1 for patients on thiotepa and 0 for patients on placebo; but first we check whether such a model looks plausible; i.e., whether the hazards look proportional. Starting from Equation 7.2 we can integrate both sides to get

$$H(y) \equiv \int_0^y h(z)\,dz = -\log S(y).$$

$H(y)$ is called the cumulative hazard function. Thus, if the two groups have proportional hazards, they also have proportional cumulative hazard functions and log survivor functions. Proportional cumulative hazards differ by an additive constant on the log scale, so their logarithms plot as parallel curves. Figure 7.10 plots the estimated cumulative hazard and log(cumulative hazard) functions for the bladder tumor data. The log(cumulative hazard) functions look parallel, so the proportional hazards assumption looks reasonable.
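The identity $H(y) = -\log S(y)$ used above can be checked numerically for any lifetime distribution whose density and survivor function are available. The sketch below uses a Weibull distribution in R's own (shape, scale) parameterization; the parameter values and the names h, H.num, and H.exact are arbitrary illustrations, not taken from the text.

```r
# Illustrative Weibull parameters in R's parameterization.
shape <- 2
scale <- 3

# Hazard as density over survivor function, as in Equation 7.2.
h <- function(z) {
  dweibull(z, shape, scale) / pweibull(z, shape, scale, lower.tail = FALSE)
}

y <- 2.5

# Cumulative hazard by numerical integration of h from 0 to y ...
H.num <- integrate(h, lower = 0, upper = y)$value

# ... and directly as minus the log of the survivor function.
H.exact <- -pweibull(y, shape, scale, lower.tail = FALSE, log.p = TRUE)

print(c(integrated = H.num, minus.log.S = H.exact))
```

With shape greater than 1 the hazard h is increasing in its argument, which is the case the text singles out as more realistic than a constant hazard.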


Figure 7.10: Cumulative hazard and log(hazard) curves for bladder cancer. Solid line for thiotepa; dashed line for placebo.

Figure 7.10 was produced with the snippet

    plot(blad.fit, conf.int=FALSE, mark=1, xlab="months",
         ylab="cumulative hazard", lty=1:2, fun="cumhaz")
    plot(blad.fit, conf.int=FALSE, mark=1, xlab="months",
         ylab="log(cumulative hazard)", lty=1:2, fun="cloglog")

The fun argument allows transformations of the survival curve. fun="cumhaz" plots the cumulative hazard function and fun="cloglog" plots the log(cumulative hazard) function.


Since the proportional hazards model looks reasonable, we fit it:

    blad.cox <- coxph(blad.surv ~ bladder[event.first,"rx"])

Printing blad.cox yields

    Call:
    coxph(formula = blad.surv ~ bladder[event.first, "rx"])

                                 coef exp(coef) se(coef)     z    p
    bladder[event.first, "rx"] -0.371      0.69    0.303 -1.22 0.22

    Likelihood ratio test=1.54  on 1 df, p=0.215  n= 85

The estimated coefficient is $\hat\beta_{\mathrm{trt}} = -0.371$. Thus the hazard function for thiotepa patients is estimated to be $\exp(-0.371) = 0.69$ times that for placebo patients. The standard error of $\hat\beta_{\mathrm{trt}}$ is about 0.3; so $\hat\beta_{\mathrm{trt}}$ is accurate to about $\pm 0.6$ or so.

7.5 The Poisson process

Example 7.5 (Earthquakes)

Example 7.6 (Neurons firing)

7.6 Change point models

Example 7.7 (Everglades)


7.7 Spatial models

7.8 Point Process Models

7.9 Evaluating and enhancing models

residuals, RSS, pred. vs. obs., QQ plots, chi-square, AIC, BIC, DIC, SE's of coefficients, testing nested models, others?

7.10 Exercises

1. (a) Make plots analogous to Figures 7.3 and 7.4, compute autocorrelations, and interpret for the other data sets in Figure 7.2.
   (b) Make plots analogous to Figure 7.5, compute partial autocorrelations, and interpret for the other data sets in Figure 7.2.

2. Create and fit a good model for body temperatures of the second beaver. Use the data set beaver2.

3. (a) Why does Equation 7.1 average over a year? Why isn't it, for example,

   $$\hat g_k(t) = \frac{.5\,y_{t-k} + y_{t-k+1} + \cdots + y_{t+k-1} + .5\,y_{t+k}}{2k}$$

   for some $k \ne 6$?
   (b) Examine $\hat g$ in Equation 7.1. Use R if necessary. Why are some of the entries NA?

4. The R code for Figure 7.6 contains the lines

       resids <- matrix(resids, nrow=12)
       cycle <- apply(resids, 1, mean, na.rm=T)

   Would the following lines work instead?

       resids <- matrix(resids, ncol=12)
       cycle <- apply(resids, 2, mean, na.rm=T)


   Why or why not?

5. Figure 7.7 and the accompanying text suggest that $Y_t \approx Y_{t-1}$ is a good model for the DAX data. But that doesn't square with the observation that the $Y_t$'s have a generally increasing trend.
   (a) Find a quantitative way to show the trend.
   (b) Say why the DAX analysis missed the trend.
   (c) Improve the analysis so it's consistent with the trend.

6. Figures 7.7 and 7.8 and the accompanying text analyze the DAX time series as though it has the same structure throughout the entire time. Does that make sense? Think of and implement some way of investigating whether the structure of the series changes from early to late.

7. Choose one or more of the other EU Stock Markets that come with the DAX data. Investigate whether it has the same structure as the DAX.

8. (a) Make a plausible analysis of the UK Lung Disease data.
   (b) R has the three data sets ldeaths, fdeaths, and mdeaths, which are the total deaths, the deaths of females, and the deaths of males. Do the deaths of females and males follow similar distributional patterns? Justify your answer.

9. Make a plausible analysis of the Presidents approval ratings.

10. (a) Make a plausible analysis of the UK drivers deaths.
    (b) According to R, "Compulsory wearing of seat belts was introduced on 31 Jan 1983." Did that affect the number of deaths? Justify your answer.
    (c) Is the number of deaths related to the number of kilometers driven? (Use the variable kms in the Seatbelts data set.) Justify your answer.

11. This question follows up Example 7.4. In the example we analyzed the data to learn the effect of thiotepa on the recurrence of bladder tumors. But the data set has two other variables that might be important covariates: the number of initial tumors and the size of the largest initial tumor.
    (a) Find the distribution of the numbers of initial tumors. How many patients had 1 initial tumor, how many had 2, etc.?


    (b) Divide patients, in a sensible way, into groups according to the number of initial tumors. You must decide how many groups there should be and what the group boundaries should be.
    (c) Make plots similar to Figures 7.9 and 7.10 to see whether a proportional hazards model looks sensible for number of initial tumors.
    (d) Fit a proportional hazards model and report the results.
    (e) Repeat the previous analysis, but for size of largest initial tumor.
    (f) Fit a proportional hazards model with three covariates: treatment, number of initial tumors, size of largest initial tumor. Report the results.


CHAPTER 8
MATHEMATICAL STATISTICS

8.1 Properties of Statistics

8.1.1 Sufficiency

Consider the following two facts.

1. Let $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{Poi}(\lambda)$. Chapter 2, Exercise 7 showed that $\ell(\lambda)$ depends only on $\sum Y_i$ and not on the specific values of the individual $Y_i$'s.

2. Let $Y_1, \ldots, Y_n \sim$ i.i.d. $\mathrm{Exp}(\lambda)$. Chapter 2, Exercise 20 showed that $\ell(\lambda)$ depends only on $\sum Y_i$ and not on the specific values of the individual $Y_i$'s.

Further, since $\ell(\lambda)$ quantifies how strongly the data support each value of $\lambda$, other aspects of $\vec y$ are irrelevant. For inference about $\lambda$ it suffices to know $\ell(\lambda)$, and therefore, for Poisson and Exponential data, it suffices to know $\sum Y_i$. We don't need to know the individual $Y_i$'s. We say that $\sum Y_i$ is a sufficient statistic for $\lambda$.

Section 8.1.1 examines the general concept of sufficiency. We work in the context of a parametric family. The idea of sufficiency is formalized in Definition 8.1.

Definition 8.1. Let $\{p(\cdot \mid \theta)\}$ be a family of probability densities indexed by a parameter $\theta$. Let $\vec y = (y_1, \ldots, y_n)$ be a sample from $p(\cdot \mid \theta)$ for some unknown $\theta$. Let $T(\vec y)$ be a statistic such that the joint distribution factors as

$$\prod p(y_i \mid \theta) = g(T(\vec y), \theta)\, h(\vec y)$$

for some functions $g$ and $h$. Then $T$ is called a sufficient statistic for $\theta$.


The idea is that once the data have been observed, $h(\vec y)$ is a constant that does not depend on $\theta$, so $\ell(\theta) \propto \prod p(y_i \mid \theta) = g(T, \theta)\, h(\vec y) \propto g(T, \theta)$. Therefore, in order to know the likelihood function and make inference about $\theta$, we need only know $T(\vec y)$, not anything else about $\vec y$. For our Poisson and Exponential examples we can take $T(\vec y) = \sum y_i$.

For a more detailed look at sufficiency, think of generating three $\mathrm{Bern}(\theta)$ trials $\vec y \equiv (y_1, y_2, y_3)$. $\vec y$ can be generated, obviously, by generating $y_1, y_2, y_3$ sequentially. The possible outcomes and their probabilities are

    (0,0,0)                        $(1-\theta)^3$
    (1,0,0), (0,1,0), (0,0,1)      $\theta(1-\theta)^2$
    (1,1,0), (1,0,1), (0,1,1)      $\theta^2(1-\theta)$
    (1,1,1)                        $\theta^3$

But $\vec y$ can also be generated by a two-step procedure:

1. Generate $\sum y_i = 0, 1, 2, 3$ with probabilities $(1-\theta)^3$, $3\theta(1-\theta)^2$, $3\theta^2(1-\theta)$, $\theta^3$, respectively.

2. (a) If $\sum y_i = 0$, generate $(0,0,0)$.
   (b) If $\sum y_i = 1$, generate $(1,0,0)$, $(0,1,0)$, or $(0,0,1)$, each with probability 1/3.
   (c) If $\sum y_i = 2$, generate $(1,1,0)$, $(1,0,1)$, or $(0,1,1)$, each with probability 1/3.
   (d) If $\sum y_i = 3$, generate $(1,1,1)$.

It is easy to check that the two-step procedure generates each of the 8 possible outcomes with the same probabilities as the obvious sequential procedure. For generating $\vec y$ the two procedures are equivalent. But in the two-step procedure, only the first step depends on $\theta$. So if we want to use the data to learn about $\theta$, we need only know the outcome of the first step. The second step is irrelevant. I.e., we need only know $\sum y_i$. In other words, $\sum y_i$ is sufficient.
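The equivalence of the sequential and two-step procedures can be verified by enumerating all eight outcomes. The following is a sketch with an arbitrary illustrative value of $\theta$; the names outcomes, s, p.seq, and p.two are not from the text.

```r
theta <- 0.3   # illustrative value; any theta in (0, 1) would do

# All 8 outcomes of three Bernoulli trials and their sums.
outcomes <- expand.grid(y1 = 0:1, y2 = 0:1, y3 = 0:1)
s <- rowSums(outcomes)

# Sequential procedure: multiply the three Bernoulli probabilities.
p.seq <- theta^s * (1 - theta)^(3 - s)

# Two-step procedure: draw the sum from Bin(3, theta), then choose
# uniformly among the choose(3, s) arrangements with that sum.
p.two <- dbinom(s, size = 3, prob = theta) * (1 / choose(3, s))

print(cbind(outcomes, sum = s, p.seq, p.two))
```

Only the first factor in p.two involves theta, which is the sense in which the second step carries no information about the parameter.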


For an example of another type, let $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{U}(0, \theta)$. What is a sufficient statistic for $\theta$?

$$p(\vec y \mid \theta) = \begin{cases} \frac{1}{\theta^n} & \text{if } y_i < \theta \text{ for } i = 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \;=\; \frac{1}{\theta^n}\, 1_{(0,\theta)}(y_{(n)})$$

shows that $y_{(n)}$, the maximum of the $y_i$'s, is a one dimensional sufficient statistic for $\theta$.

Example 8.1
In World War II, when German tanks came from the factory they had serial numbers labelled consecutively from 1. I.e., the numbers were 1, 2, .... The Allies wanted to estimate $T$, the total number of German tanks, and had, as data, the serial numbers of captured tanks. See Exercise 22 in Chapter 5. Assume that tanks were captured independently of each other and that all tanks were equally likely to be captured. Let $x_1, \ldots, x_n$ be the serial numbers of the captured tanks. Then $x_{(n)}$ is a sufficient statistic. Inference about the total number of German tanks should be based on $x_{(n)}$ and not on any other aspect of the data.

If $y$ is a random variable whose values are in a space $\mathcal{Y}$, then $\vec y$ is a random variable whose values are in $\mathcal{Y}^n$. For any statistic $T$ we can divide $\mathcal{Y}^n$ into subsets indexed by $T$. I.e., for each value $t$, we define the subset

$$\mathcal{Y}^n_t = \{\vec y \in \mathcal{Y}^n : T(\vec y) = t\}.$$

Then $T$ is a sufficient statistic if and only if $p(\vec y \mid \vec y \in \mathcal{Y}^n_t)$ does not depend on $\theta$.

Sometimes sufficient statistics are higher dimensional. For example, let $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{Gam}(\alpha, \beta)$. Then

$$\prod p(y_i \mid \alpha, \beta) = \prod \frac{1}{\Gamma(\alpha)\beta^\alpha}\, y_i^{\alpha-1} e^{-y_i/\beta} = \left(\frac{1}{\Gamma(\alpha)\beta^\alpha}\right)^{n} \left(\prod y_i\right)^{\alpha-1} e^{-\sum y_i/\beta}$$

so $T(\vec y) = (\prod y_i, \sum y_i)$ is a two dimensional sufficient statistic.

Sufficient statistics are not unique. If $T = T(\vec y)$ is a sufficient statistic, and if $f$ is a 1-1 function, then $f(T)$ is also sufficient. So in the Poisson, Exponential, and


Bernoulli examples where $\sum y_i$ was sufficient, $\bar y = \sum y_i / n$ is also sufficient. But the lack of uniqueness is even more severe. The whole data set $T(\vec y) = \vec y$ is an $n$-dimensional sufficient statistic because

$$\prod p(y_i \mid \theta) = g(T(\vec y), \theta)\, h(\vec y)$$

where $g(T(\vec y), \theta) = p(\vec y \mid \theta)$ and $h(\vec y) = 1$. The order statistic $T(\vec y) = (y_{(1)}, \ldots, y_{(n)})$ is another $n$-dimensional sufficient statistic. Also, if $T$ is any sufficient one dimensional statistic then $T_2 = (y_1, T)$ is a two dimensional sufficient statistic. But it is intuitively clear that these sufficient statistics are higher-dimensional than necessary. They can be reduced to lower dimensional statistics while retaining sufficiency, that is, without losing information.

The key idea in the preceding paragraph is that the high dimensional sufficient statistics can be transformed into the low dimensional ones, but not vice versa. E.g., $\bar y$ is a function of $(y_{(1)}, \ldots, y_{(n)})$ but $(y_{(1)}, \ldots, y_{(n)})$ is not a function of $\bar y$. Definition 8.2 is for statistics that have been reduced as much as possible without losing sufficiency.

Definition 8.2. A sufficient statistic $T(\vec y)$ is called minimal sufficient if, for every other sufficient statistic $T_2$, $T(\vec y)$ is a function of $T_2(\vec y)$.

This book does not delve into methods for finding minimal sufficient statistics. In most cases the user can recognize whether a statistic is minimal sufficient.

Does the theory of sufficiency imply that statisticians need look only at sufficient statistics and not at other aspects of the data? Not quite. Let $y_1, \ldots, y_n$ be binary random variables and suppose we adopt the model $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{Bern}(\theta)$. Then for estimating $\theta$ we need look only at $\sum y_i$. But suppose $y_1, \ldots, y_n$ turn out to be

$$\underbrace{0\,0 \cdots 0}_{\text{many 0's}}\ \underbrace{1\,1 \cdots 1}_{\text{many 1's}};$$

i.e., many 0's followed by many 1's. Such a data set would cast doubt on the assumption that the $y_i$'s are independent. Judging from this data set, it looks much more likely that the $y_i$'s come in streaks. So statisticians should look at all the data, not just sufficient statistics, because looking at all the data can help us create and critique models. But once a model has been adopted, then inference should be based on sufficient statistics.
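The point that a sufficient statistic can hide evidence against the model is easy to see with two made-up binary data sets that share the same sum. The sketch below counts runs with rle; the data and the names y.streak, y.alt, runs.streak, and runs.alt are purely illustrative.

```r
# Two made-up data sets of 20 binary observations with the same sum.
y.streak <- rep(c(0, 1), each = 10)   # ten 0's followed by ten 1's
y.alt <- rep(c(0, 1), times = 10)     # 0, 1, 0, 1, ...

# The sufficient statistic under the i.i.d. Bernoulli model.
print(c(sum(y.streak), sum(y.alt)))

# Number of runs (maximal blocks of equal values) in each data set.
runs.streak <- length(rle(y.streak)$lengths)
runs.alt <- length(rle(y.alt)$lengths)
print(c(streak = runs.streak, alternating = runs.alt))
```

Under the Bernoulli model both data sets lead to the same inference about $\theta$, yet their run structure differs as much as it possibly can, which is information relevant to judging the independence assumption.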
8.1.2 Consistency, Bias, and Mean-squared Error

Consistency

Heuristically speaking, as we collect ever more data we should be able to learn the truth ever more accurately. This heuristic is captured formally,


at least for parameter estimation, by the notion of consistency. To say whether an estimator is consistent we have to define it for every sample size. To that end, let $Y_1, Y_2, \ldots \sim$ i.i.d. $f$ for some unknown density $f$ having finite mean $\mu$ and SD $\sigma$. For each $n \in \mathbb{N}$ let $T_n : \mathbb{R}^n \to \mathbb{R}$. I.e., $T_n$ is a real-valued function of $(y_1, \ldots, y_n)$. For example, if we're trying to estimate $\mu$ we might take $T_n = n^{-1}\sum_1^n y_i$.

Definition 8.3. The sequence of estimators $T_1, T_2, \ldots$ is said to be consistent for the parameter $\theta$ if for every $\theta$ and for every $\epsilon > 0$,

$$\lim_{n \to \infty} \mathrm{P}[\,|T_n - \theta| < \epsilon\,] = 1.$$

For example, the Law of Large Numbers, Theorem 1.12, says the sequence of sample means $\{T_n = n^{-1}\sum_1^n y_i\}$ is consistent for $\mu$. Similarly, let $S_n = n^{-1}\sum_i (y_i - T_n)^2$ be the sample variance. Then $\{S_n\}$ is consistent for $\sigma^2$. More generally, m.l.e.'s are consistent.

Theorem 8.1. Let $Y_1, Y_2, \ldots \sim$ i.i.d. $p_Y(y \mid \theta)$ and let $\hat\theta_n$ be the m.l.e. from the sample $(y_1, \ldots, y_n)$. Further, let $g$ be a continuous function of $\theta$. Then, subject to regularity conditions, $\{g(\hat\theta_n)\}$ is a consistent sequence of estimators for $g(\theta)$.

Proof. The proof requires regularity conditions relating to differentiability and the interchange of integral and derivative. It is beyond the scope of this book.

Consistency is a good property; one should be wary of an inconsistent estimator. On the other hand, consistency alone does not guarantee that a sequence of estimators is optimal, or even sensible. For example, let $R_n(y_1, \ldots, y_n) = \lfloor n/2 \rfloor^{-1}(y_1 + \cdots + y_{\lfloor n/2 \rfloor})$, the mean of the first half of the observations. ($\lfloor w \rfloor$ is the floor of $w$, the largest integer not greater than $w$.) The sequence $\{R_n\}$ is consistent for $\mu$ but is not as good as the sequence of sample means.

Bias

It seems natural to want the sampling distribution of an estimator to be centered around the parameter being estimated. This desideratum is captured formally, at least for centering in the sense of expectation, by the notion of bias.

Definition 8.4.
Let $\hat\theta = \hat\theta(y_1, \ldots, y_n)$ be an estimator of a parameter $\theta$. The quantity $\mathrm{E}[\hat\theta] - \theta$ is called the bias of $\hat\theta$. An estimator whose bias is 0 is called unbiased.

Here are some examples.

An unbiased estimator. Let $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$ and consider $\hat\mu = \bar y$ as an estimate of $\mu$. Because $\mathrm{E}[\bar y] = \mu$, $\bar y$ is an unbiased estimate of $\mu$.
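A small simulation can illustrate both unbiasedness and the earlier remark that the half-sample mean $R_n$ is not as good as the full-sample mean $T_n$. The sketch below assumes standard Normal data, so $\mu = 0$ and $\sigma = 1$; the sample size, number of replications, seed, and variable names are arbitrary choices made for illustration.

```r
set.seed(1)    # arbitrary seed, for reproducibility only
n <- 100       # sample size
reps <- 20000  # number of simulated samples

# Each row is one sample of size n from N(0, 1).
y <- matrix(rnorm(n * reps), nrow = reps)

T.n <- rowMeans(y)                    # mean of all n observations
R.n <- rowMeans(y[, 1:floor(n / 2)])  # mean of the first half only

# Both averages should be near mu = 0 (both estimators are unbiased),
# but R.n should have roughly twice the variance of T.n.
print(c(mean.T = mean(T.n), mean.R = mean(R.n)))
print(c(var.T = var(T.n), var.R = var(R.n)))
```

Both estimators are centered at $\mu$, so bias alone cannot distinguish them; the difference shows up in their spread.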


A biased estimator. Let $y_1, \ldots, y_n \sim$ i.i.d. $\mathrm{N}(\mu, \sigma)$ and consider

$$\hat\sigma^2 = n^{-1}\sum (y_i - \bar y)^2$$

as an estimate of $\sigma^2$.

$$\begin{aligned}
\mathrm{E}\Big[n^{-1}\sum (y_i - \bar y)^2\Big] &= n^{-1}\,\mathrm{E}\Big[\sum (y_i - \mu + \mu - \bar y)^2\Big] \\
&= n^{-1}\Big\{ \mathrm{E}\Big[\sum (y_i - \mu)^2\Big] + 2\,\mathrm{E}\Big[\sum (y_i - \mu)(\mu - \bar y)\Big] + \mathrm{E}\Big[\sum (\mu - \bar y)^2\Big] \Big\} \\
&= n^{-1}\big\{ n\sigma^2 - 2\sigma^2 + \sigma^2 \big\} \\
&= \sigma^2 - n^{-1}\sigma^2 \\
&= \frac{n-1}{n}\,\sigma^2
\end{aligned}$$

Therefore $\hat\sigma^2$ is a biased estimator of $\sigma^2$. Its bias is $-\sigma^2/n$. Some statisticians prefer to use the unbiased estimator $\tilde\sigma^2 = (n-1)^{-1}\sum (y_i - \bar y)^2$.

A biased estimator. Let $x_1, \ldots, x_n \sim$ i.i.d. $\mathrm{U}(0, \theta)$ and consider $\hat\theta = x_{(n)}$ as an estimate of $\theta$. ($\hat\theta$ is the m.l.e.; see Section 5.4.) But $x_{(n)} < \theta$; therefore $\mathrm{E}[x_{(n)}] < \theta$; therefore $x_{(n)}$ is a biased estimator of $\theta$.

Mean Squared Error

8.1.3 Efficiency

8.1.4 Asymptotic Normality

8.1.5 Robustness

8.2 Transformations of Parameters

Equivalent parameterizations, especially ANOVA's, etc. Invariance of MLEs.

8.3 Information

8.4 More Hypothesis Testing

[a more formal presentation here?]


8.4.1 p values

8.4.2 The Likelihood Ratio Test

8.4.3 The Chi Square Test

8.4.4 Power

8.5 Exponential families

8.6 Location and Scale Families

Location/scale families

8.7 Functionals

functionals

8.8 Invariance

Invariance

8.9 Asymptotics

In real life, data sets are finite: $(y_1, \ldots, y_n)$. Yet we often appeal to the Law of Large Numbers or the Central Limit Theorem, Theorems 1.12, 1.13, and 1.14, which concern the limit of a sequence of random variables as $n \to \infty$. The hope is that when $n$ is large those theorems will tell us something, at least approximately, about the distribution of the sample mean. But we're faced with the questions "How large is large?" and "How close is the approximation?"

To take an example, we might want to apply the Law of Large Numbers or the Central Limit Theorem to a sequence $Y_1, Y_2, \ldots$ of random variables from a distribution with mean $\mu$ and SD $\sigma$. Here are a few instances of the first several elements of such a sequence.


     0.70  0.29  0.09 -0.23 -0.30 -0.79 -0.72 -0.35  1.79  ···
    -0.23 -0.24  0.29 -0.16  0.37 -0.01 -0.48 -0.59  0.39  ···
    -1.10 -0.91 -0.34  0.22  1.07 -1.51 -0.41 -0.65  0.07  ···
      ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮

Each sequence occupies one row of the array. The ··· indicates that the sequence continues infinitely. The ⋮ indicates that there are infinitely many such sequences. The numbers were generated by

    y <- matrix(NA, 3, 9)       # 3 sequences, first 9 elements of each
    for (i in 1:3) {
      y[i,] <- rnorm(9)         # draws from N(0,1)
      print(round(y[i,], 2))
    }

• I chose to generate $Y_i$'s from the $\mathrm{N}(0,1)$ distribution, so I used rnorm, and so, for this example, $\mu = 0$ and $\sigma = 1$. Those are arbitrary choices. I could have used any values of $\mu$ and $\sigma$ and any distribution for which I know how to generate random variables on the computer.

• round does rounding. In this case we're printing each number with two decimal places.

Because there are multiple sequences, each with multiple elements, we need two subscripts to keep track of things properly. Let $Y_{ij}$ be the $j$'th element of the $i$'th sequence. For the $i$'th sequence of random variables, we're interested in the sequence of means $\bar{Y}_{i1}, \bar{Y}_{i2}, \dots$ where $\bar{Y}_{in} = (Y_{i1} + \cdots + Y_{in})/n$. And we're also interested in the sequence $Z_{i1}, Z_{i2}, \dots$ where $Z_{in} = \sqrt{n}\,(\bar{Y}_{in} - \mu)/\sigma$. For the three instances above, the $\bar{Y}_{in}$'s and $Z_{in}$'s can be printed with

    for (i in 1:3) {
      # running means: (Y_i1 + ... + Y_in) / n
      print(round(cumsum(y[i,]) / 1:9, 2))
      # mu = 0 and sigma = 1 here, so Z_in = (Y_i1 + ... + Y_in) / sqrt(n)
      print(round(cumsum(y[i,]) / sqrt(1:9), 2))
    }

• cumsum computes a cumulative sum; so cumsum(y[1,]) yields the vector y[1,1], y[1,1]+y[1,2], ..., y[1,1]+...+y[1,9]. (Print out cumsum(y[1,]) if you're not sure what it is.) Therefore, cumsum(y[i,])/1:9 is the sequence of $\bar{Y}_{in}$'s.


• sqrt computes the square root. So the second print statement prints the sequence of $Z_{in}$'s.

The results for the $\bar{Y}_{in}$'s are

     0.70  0.49  0.36  0.21  0.11 -0.04 -0.14 -0.16  0.05  ···
    -0.23 -0.23 -0.06 -0.08  0.01  0.00 -0.07 -0.13 -0.07  ···
    -1.10 -1.01 -0.78 -0.53 -0.21 -0.43 -0.43 -0.45 -0.40  ···
      ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮

and for the $Z_{in}$'s, computed as $\sqrt{n}\,\bar{Y}_{in}$ from the two-decimal $Y_{ij}$'s printed earlier (so the last digit is approximate), are

     0.70  0.70  0.62  0.42  0.25 -0.10 -0.36 -0.46  0.16  ···
    -0.23 -0.33 -0.10 -0.17  0.01  0.01 -0.17 -0.37 -0.22  ···
    -1.10 -1.42 -1.36 -1.06 -0.47 -1.05 -1.13 -1.28 -1.19  ···
      ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮     ⋮

We're interested in the following questions.

1. Will every sequence of $\bar{Y}_i$'s or $Z_i$'s converge? This is a question about the limit along each row of the array.

2. If they converge, do they all have the same limit?

3. If not every sequence converges, what fraction of them converge; or what is the probability that a randomly chosen sequence of $\bar{Y}_i$'s or $Z_i$'s converges?

4. For a fixed $n$, what is the distribution of $\bar{Y}_n$ or $Z_n$? This is a question about the distribution along columns of the array.

5. Does the distribution of $\bar{Y}_n$ or $Z_n$ depend on $n$?

6. Is there a limiting distribution as $n \to \infty$?

Some simple examples and the Strong Law of Large Numbers, Theorem 1.13, answer questions 1, 2, and 3 for the sequences of $\bar{Y}_i$'s. The Central Limit Theorem, Theorem 1.14, answers question 6 for the sequences of $Z_i$'s.

1. Will every sequence of $\bar{Y}_i$'s converge? No. Suppose the sequence of $Y_i$'s is $1, 2, 3, \dots$. Then $\{\bar{Y}_i\}$ increases without limit and does not converge.

2. If they converge, do they have the same limit? No. Here are two sequences of $Y_i$'s.


     1  1  1  ···
    -1 -1 -1  ···

The corresponding sequences $\{\bar{Y}_i\}$ converge to different limits.

3. What is the probability of convergence? The probability of convergence is 1. That's the Strong Law of Large Numbers. In particular, the probability of randomly getting a sequence like $1, 2, 3, \dots$ that doesn't converge is 0. But the Strong Law of Large Numbers says even more. It says

\[
P\big[\lim_{n \to \infty} \bar{Y}_n = \mu\big] = 1.
\]

So the probability of getting sequences like $1, 1, 1, \dots$ or $-1, -1, -1, \dots$ that converge to something other than $\mu$ is 0.

4. What is the distribution of $Z_n$? We cannot say in general. It depends on the distribution of the individual $Y_i$'s.

5. Does the distribution of $Z_n$ depend on $n$? Yes, except in the special case where $Y_i \sim \mathrm{N}(0,1)$ for all $i$.

6. Is there a limiting distribution? Yes. That's the Central Limit Theorem. Regardless of the distribution of the $Y_{ij}$'s, as long as $\mathrm{Var}(Y_{ij}) < \infty$, the limit, as $n \to \infty$, of the distribution of $Z_n$ is $\mathrm{N}(0,1)$.

The Law of Large Numbers and the Central Limit Theorem are theorems about the limit as $n \to \infty$. When we use those theorems in practice we hope that our sample size $n$ is large enough that $\bar{Y}_{in} \approx \mu$ and $Z_{in} \sim \mathrm{N}(0,1)$, approximately. But how large should $n$ be before relying on these theorems, and how good is the approximation? The answer is, "It depends on the distribution of the $Y_{ij}$'s". That's what we look at next.

To illustrate, we generate sequences of $Y_{ij}$'s from two distributions, compute $\bar{Y}_{in}$'s and $Z_{in}$'s for several values of $n$, and compare. One distribution is $\mathrm{U}(0,1)$; the other is a recentered and rescaled version of $\mathrm{Be}(.39, .01)$.

The $\mathrm{Be}(.39, .01)$ density, shown in Figure 8.1, was chosen for its asymmetry. It has a mean of $.39/.40 = .975$ and a variance of $(.39)(.01)/\big((.40)^2(1.40)\big) \approx .017$. It was recentered and rescaled to have a mean of $.5$ and variance of $1/12$, the same as the $\mathrm{U}(0,1)$ distribution.

Densities of the $\bar{Y}_{in}$'s are in Figure 8.2. As the sample size increases from $n = 10$ to $n = 270$, the $\bar{Y}_{in}$'s from both distributions get closer to their expected value of 0.5. That's the Law of Large Numbers at work. The amount by which they're off their mean goes from about $.2$ to about $.04$. That's Corollary 1.10 at work. And finally, as $n \to \infty$, the densities get more Normal. That's the Central Limit Theorem at work.

Note that the density of the $\bar{Y}_{in}$'s derived from the $\mathrm{U}(0,1)$ distribution is close to Normal even for the smallest sample size, while the density of the $\bar{Y}_{in}$'s derived from the $\mathrm{Be}(.39, .01)$ distribution is way off. That's because $\mathrm{U}(0,1)$ is symmetric and unimodal, and therefore close to Normal to begin with, while $\mathrm{Be}(.39, .01)$ is far from symmetric and unimodal, and therefore far from Normal, to begin with. So $\mathrm{Be}(.39, .01)$ needs a larger $n$ to make the Central Limit Theorem work; i.e., to be a good approximation.

Figure 8.3 is for the $Z_{in}$'s. It's the same as Figure 8.2 except that each density has been recentered and rescaled to have mean 0 and variance 1. When put on the same scale we can see that all densities are converging to $\mathrm{N}(0,1)$.

Figure 8.1: The $\mathrm{Be}(.39, .01)$ density

Figure 8.1 was produced by

    x <- seq(.01, .99, length=80)
    plot(x, dbeta(x, .39, .01), type="l", ylab="", xlab="")
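The recentering and rescaling of the Be(.39, .01) distribution can be checked numerically. The following is a minimal sketch and not part of the book's code: the names a, b, beta.mean, beta.var, scale.factor, rescale and w, the seed, and the number of draws are all choices made here for illustration. It computes the mean and variance from the standard formulas for a Beta distribution, builds the linear map that sends them to 1/2 and 1/12, and then compares the sample moments of transformed draws with those targets.

```r
# Parameters of the Beta distribution used in the text.
a <- 0.39
b <- 0.01

# Mean and variance of Be(a, b) from the standard Beta formulas.
beta.mean <- a / (a + b)                            # .39/.40 = .975
beta.var  <- a * b / ((a + b)^2 * (a + b + 1))      # approximately .017

# Linear map that gives mean 1/2 and variance 1/12, matching U(0,1).
scale.factor <- sqrt((1 / 12) / beta.var)
rescale <- function(x) (x - beta.mean) * scale.factor + 0.5

# Monte Carlo check; the seed and sample size are arbitrary choices.
set.seed(1)
w <- rescale(rbeta(1e5, a, b))

print(round(c(mean = beta.mean, var = beta.var), 4))  # theoretical moments
print(round(c(mean = mean(w), var = var(w)), 3))      # should be near 1/2 and 1/12
```

The second printed pair should be close to 0.5 and 1/12 (about 0.083), though the exact digits depend on the random draws. The multiplier scale.factor is the same quantity as sqrt(.4^2*1.4/(.39*.01*12)), written here in terms of the variance so that its purpose is easier to see.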


Figure 8.2 was generated by the following R code.

    samp.size <- c(10, 30, 90, 270)
    n.reps <- 500

    # each row holds one simulated sequence of length max(samp.size)
    Y.1 <- matrix(NA, n.reps, max(samp.size))
    Y.2 <- matrix(NA, n.reps, max(samp.size))
    for (i in 1:n.reps) {
      Y.1[i,] <- runif(max(samp.size), 0, 1)
      Y.2[i,] <- (rbeta(max(samp.size), 0.39, .01) - .975) *
                 sqrt(.4^2 * 1.4 / (.39 * .01 * 12)) + .5
    }

    par(mfrow=c(2,2))
    for (n in 1:length(samp.size)) {
      # sample means of the first samp.size[n] elements of each sequence
      Ybar.1 <- apply(Y.1[,1:samp.size[n]], 1, mean)
      Ybar.2 <- apply(Y.2[,1:samp.size[n]], 1, mean)
      # SD of a mean of samp.size[n] draws with variance 1/12
      sd <- sqrt(1 / (12 * samp.size[n]))
      x <- seq(.5 - 3*sd, .5 + 3*sd, length=60)
      y <- dnorm(x, .5, sd)
      den1 <- density(Ybar.1)
      den2 <- density(Ybar.2)
      ymax <- max(y, den1$y, den2$y)
      plot(x, y, ylim=c(0, ymax), type="l", lty=3, ylab="",
           xlab="", main=paste("n =", samp.size[n]))
      lines(den1, lty=2)
      lines(den2, lty=4)
    }

• The manipulations in the line Y.2[i,] <- ... are so Y.2 will have mean 1/2 and variance 1/12.

Figure 8.2: Densities of $\bar{Y}_{in}$ for the $\mathrm{U}(0,1)$ (dashed), modified $\mathrm{Be}(.39, .01)$ (dash and dot), and Normal (dotted) distributions.

Figure 8.3: Densities of $Z_{in}$ for the $\mathrm{U}(0,1)$ (dashed), modified $\mathrm{Be}(.39, .01)$ (dash and dot), and Normal (dotted) distributions.

8.10 Exercises

1. Let $Y_1, \dots, Y_n$ be a sample from $\mathrm{N}(\mu, \sigma)$.

   (a) Suppose $\mu$ is unknown but $\sigma$ is known. Find a one dimensional sufficient statistic for $\mu$.

   (b) Suppose $\mu$ is known but $\sigma$ is unknown. Find a one dimensional sufficient statistic for $\sigma$.

   (c) Suppose $\mu$ and $\sigma$ are both unknown. Find a two dimensional sufficient statistic for $(\mu, \sigma)$.

2. Let $Y_1, \dots, Y_n$ be a sample from $\mathrm{Be}(\alpha, \beta)$. Find a two dimensional sufficient statistic for $(\alpha, \beta)$.

3. Let $Y_1, \dots, Y_n \sim$ i.i.d. $\mathrm{U}(-\theta, \theta)$. Find a low dimensional sufficient statistic for $\theta$.



