CONFRONTING ASSUMPTIONS OF OCCUPANCY ESTIMATION AND SPECIES DISTRIBUTION MODELS: THE PROBLEM OF DETECTABILITY IN WILDLIFE POPULATIONS

By CHRISTOPHER T. ROTA

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA
2009

© 2009 Christopher T. Rota

To Beth, your support made this possible
ACKNOWLEDGMENTS

Special thanks are due to my major advisor, Robert Fletcher, for his guidance, support, and encouragement throughout this entire process.

For chapter 1, I thank co-authors R. Dorazio and M. Betts for providing invaluable guidance in developing this chapter. Committee member J. Hayes and lab mates M. Acevedo and J. Hostetler provided valuable insight and feedback on earlier versions of this chapter. I thank A. Peterson and J. Csoka for their field assistance, and the numerous landowners who allowed access to their properties. PPL Montana, the Bureau of Land Management, and the National Research Initiative of the USDA Cooperative State Research, Education and Extension Service (#2006-55101-17158) provided support for this project. Thanks are also due to Hubbard Brook field assistants B. Griffith, M. Smith, T. Weidman, and E. Whidden, and to the staff of the Hubbard Brook Experimental Forest. Financial support was provided by the US National Science Foundation LTER program and Oregon State University. The Hubbard Brook research was conducted under the auspices of the Northern Research Station, Forest Service, US Dept. of Agriculture, Newtown Square, PA, and is a contribution of the Hubbard Brook Ecosystem Study.

For chapter 2, I thank R. Dorazio for help in developing hierarchical models. I also thank co-author J. Evans for all his assistance with the Maxent models. This work was supported by the National Research Initiative of the USDA Cooperative State Research, Education and Extension Service, grant #2006-55101-17158. The landbird database was created through support from USFS Northern Region (03-CR-11015600-019). I thank J. Young and A. Noson for providing support on the avian database and GIS
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 OCCUPANCY ESTIMATION AND THE CLOSURE ASSUMPTION
    Introduction
    Robust Design Sampling
    Dynamic Occupancy Models
    Application of Robust Design Sampling to Breeding Bird Distributions
        Application Methods
        Application Results
    Simulation Study
        Simulation Methods
        Simulation Results
    Discussion

2 DOES ADDRESSING IMPERFECT DETECTION IMPROVE PREDICTIVE PERFORMANCE OF SPECIES DISTRIBUTION MODELS?
    Introduction
    Methods
        Field Methods
        Transect and Species Selection
        Habitat and Detection Covariates
        Logistic Regression Models
        Hierarchical Occupancy Models
        Maximum Entropy Models
        Comparing Predictive Performance
    Results
    Discussion

APPENDIX

A SIMULATING THE DISTRIBUTION OF THE LIKELIHOOD RATIO TEST STATISTIC
B BIASES IN DETECTION PROBABILITY, P, RESULTING FROM VIOLATION OF THE CLOSURE ASSUMPTION
C STATISTICAL POWER OF LIKELIHOOD RATIO TEST
    Calculating Expected Data
    Calculating Noncentrality
D SCIENTIFIC NAMES AND BODY MASS

LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

Table
2-1 Regression coefficients from logistic and occupancy species distribution models
2-2 Regression coefficients for estimates of detection probability from occupancy models
2-3 Area Under the Curve (AUC) scores for each Species Distribution Modeling (SDM) approach
C-1 The probability of observing each of four mutually exclusive patterns of detections and non-detection
D-1 Scientific names and body mass
LIST OF FIGURES

Figure
1-1 Relative weights of open and closed occupancy models for breeding birds
1-2 Estimated probability of site occupancy from RD models relative to standard occupancy models
1-3 Mean estimated probability of site occupancy from 1000 replicate simulations
1-4 The power of a likelihood ratio test to detect changes in site occupancy between primary sampling periods
2-1 Location of permanently marked Northern Region Landbird Monitoring Program transects
2-2 Results of jackknife test of relative variable importance from Maxent species distribution models
2-3 Predicted habitat suitability of Olive-sided Flycatcher (Contopus cooperi) in western Montana and northern Idaho
B-1 Mean estimated probability of detection from 1000 replicate simulations
B-2 Mean estimated probability of detection from 1000 replicate simulations
Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

CONFRONTING ASSUMPTIONS OF OCCUPANCY ESTIMATION AND SPECIES DISTRIBUTION MODELS: THE PROBLEM OF DETECTABILITY IN WILDLIFE POPULATIONS

By Christopher T. Rota
May 2009
Chair: Robert J. Fletcher, Jr.
Major: Wildlife Ecology and Conservation

Recent advances in occupancy estimation that adjust for imperfect detection have provided improvements over traditional approaches by reducing bias in estimated occupancy rates and species-environment relationships. To estimate and adjust for detectability, occupancy modeling requires the strong assumption of closure. Violations of this assumption could bias parameter estimates; however, little work has assessed model sensitivity to violations of this assumption or how commonly such violations occur in nature. I first advance a modeling procedure that can test for violations of this assumption. I apply this approach to two avian point count data sets to evaluate closure over time scales typical of many wildlife investigations. Using a simulation study, I then contrast this approach with current approaches and evaluate the sensitivity of parameter estimates to violations of closure. Application of our approach to point count data indicates that habitats may frequently be open over time scales typical of many occupancy investigations. Further, simulations suggest that models assuming closure are sensitive to changes in occupancy and that our approach can effectively test for closure.

Next, I develop species distribution models using an occupancy modeling approach. Species distribution models are being used for a variety of problems in conservation biology. In many applications, perfect detectability of species is assumed. While this problem has been addressed through the development of occupancy models, we still know little regarding whether addressing imperfect detection improves the predictive performance of species distribution models in nature. Here, I contrast logistic regression models of species occurrence that do not correct for detectability with hierarchical occupancy models that explicitly estimate and adjust for detectability, and with maximum entropy models that circumvent the detectability problem by using only data on known presence locations. I contrast these models for four landbird species that vary in abundance and detectability using a large-scale, long-term monitoring database across western Montana and northern Idaho. Overall, logistic regression models were similar to, or better than, the other approaches in terms of predictive accuracy, as measured by the area under the ROC curve (AUC). I conclude by discussing the advantages and limitations of each approach for developing large-scale species distribution models.
CHAPTER 1
OCCUPANCY ESTIMATION AND THE CLOSURE ASSUMPTION

Introduction

Estimating and interpreting patterns of occupancy lie at the heart of many questions in ecology and problems in conservation. For example, metapopulation theory often explores variation in patch occupancy in fragmented landscapes (Hanski 1994). Species distribution models, which are widely used in guiding conservation and management decisions, frequently rely on observed patterns of detection and non-detection (Guisan et al. 2006). Additionally, occupancy estimation can provide valuable information on population trends when more detailed demographic or abundance estimates are not practical (e.g., Bailey et al. 2004).

Traditional approaches to occurrence estimation, such as logistic regression or Incidence Function Models (Hanski 1994), assume perfect detection of species. Recently, these approaches have been criticized because even modest amounts of false absences (i.e., modeling a species as absent when it is in fact present) can bias parameter estimates in metapopulation models and predicted habitat relationships in investigations of habitat use (Moilanen 2002, Tyre et al. 2003, Gu and Swihart 2004, Martin et al. 2005).

Because of the bias introduced by non-detection errors in traditional occupancy studies, several recent investigations have focused on how to model occupancy given imperfect detection (Geissler and Fuller 1987, Azuma et al. 1990, MacKenzie et al. 2002, Tyre et al. 2003, Stauffer et al. 2004, MacKenzie et al. 2006). Of these new techniques, MacKenzie and Royle (2005) suggest that the approach of MacKenzie et al. (2002) is the most flexible and that other approaches are special cases of their general occupancy model. A major advantage of this approach is its ability to model occupancy and detection probability simultaneously. This has provided significant improvements over traditional approaches, which is reflected in a recent surge in occupancy studies (Marsh and Trenham 2008). Additionally, these models continue to become more flexible, as they can now address issues of species co-occurrence, spatial autocorrelation (Royle et al. 2007), and multiple occupancy states (Royle and Dorazio 2008).

This approach requires multiple surveys at each site. Detection probability is then estimated from the pattern of detections and non-detections that arise from these multiple surveys. A necessary assumption is that sites are closed to changes in occupancy between surveys (i.e., the closure assumption): if a site is occupied during at least one survey, it is assumed to have been occupied during all surveys, and any non-detection during a survey is attributed to a failure to detect the species rather than to true absence.

A commonly used sampling approach for occupancy studies is to visit a site multiple times and conduct a single survey during each site visit, treating the surveys from each site visit as replicate observations (e.g., amphibians: Bailey et al. 2004, Kroll et al. 2008; birds: Tipton et al. 2008, Winchell and Doherty 2008; mammals: Ball et al. 2005). We will refer to this approach as a standard occupancy sampling protocol. Site visits are frequently separated by periods of weeks, or even months, and sites are assumed to be closed to changes in occupancy during these time periods. Violations of this assumption may lead to biased estimates of detection probability and thus occupancy. However, little work has been done to assess how violations of the closure assumption may affect occupancy estimates, and MacKenzie et al. (2006) have only inferred the strength and direction of these biases based on mark-recapture models. Further, there has been little guidance in testing for violations of this assumption (for an exception see Betts et al. 2008).
Here, we address the closure assumption for occupancy estimation by providing a sampling protocol and modeling approach that allows testing for closure. We advocate the use of robust design sampling, which requires the observer to conduct multiple surveys during each site visit. We show that this approach permits estimation of transitions in site occupancy (i.e., local colonization and extinction) and formal statistical tests of closure between site visits. Using simulations, we assess the relative performance of standard occupancy modeling approaches and those based on robust design when sites are open to changes in occupancy between site visits. Finally, using two datasets on bird distributions, we test the likelihood of closure over time periods typical of many wildlife occupancy investigations. Our comparisons provide insight into the likelihood of closure during the breeding season.

Robust Design Sampling

An intuitive and practical way to address the closure assumption is to sample populations using a robust design, an approach drawn from mark-recapture studies in which secondary sampling periods are nested within primary sampling periods. Populations are assumed to be closed to demographic changes between secondary sampling periods and open to demographic changes between primary sampling periods. We apply principles of robust design (RD) sampling in an occupancy estimation context by considering individual site visits to be primary sampling periods and the multiple surveys conducted during each site visit to be secondary sampling periods. We define a site visit as any single visit to a site during which one or more surveys are conducted to assess detection/non-detection. Thus, we assume that sites are open to changes in occupancy between site visits, but are closed to changes in occupancy during site visits. Conducting multiple surveys during each site visit minimizes the time over which closure is assumed while still providing the detection and non-detection data necessary to estimate detection probability.
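As an illustration of the nesting described above, the following is a minimal sketch, not taken from the thesis, of how RD detection/non-detection data might be laid out in Python. The array shape (sites, site visits, surveys per visit), the example values, and the helper name `collapse_to_site_visits` are all assumptions made for illustration. Collapsing the survey axis gives one outcome per site visit, which is the kind of data a standard occupancy sampling protocol would record.

```python
import numpy as np

# Hypothetical data: 3 sites, 2 site visits (primary periods),
# 4 surveys per visit (secondary periods). 1 = detected, 0 = not detected.
rd_data = np.array([
    [[0, 1, 0, 1], [0, 0, 0, 0]],   # site 0: detected on visit 1 only
    [[0, 0, 0, 0], [0, 0, 1, 0]],   # site 1: detected on visit 2 only
    [[0, 0, 0, 0], [0, 0, 0, 0]],   # site 2: never detected
])

def collapse_to_site_visits(rd):
    """Reduce each site visit to a single 0/1 outcome.

    A visit is scored 1 if the species was detected in at least one
    of its surveys, and 0 otherwise.
    """
    return rd.max(axis=2)

visit_data = collapse_to_site_visits(rd_data)
print(visit_data)
```

Keeping the survey axis preserves the within-visit replication needed to estimate detection probability over a short window; the collapsed version discards it.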
Robust design sampling can take several forms. For example, MacKenzie et al. (2006) suggest conducting a fixed number of replicate surveys or having multiple individuals conduct simultaneous surveys at the same site during the same visit. Additionally, Farnsworth et al. (2002) and Alldredge et al. (2007) suggest dividing timed surveys into several shorter sampling intervals. Since detection/non-detection is assessed during each sampling interval, each sampling interval can effectively be considered an individual survey. We refer to this approach of conducting a fixed number of replicate surveys during each primary sampling period as a fixed-replicate RD protocol. With a fixed-replicate RD protocol, single-site-visit estimates of occupancy can be computed using the approach of MacKenzie et al. (2002).

When conducting multiple surveys during a single site visit, assuring independence between surveys may prove problematic. For example, a species detected during one survey may be easier to detect during subsequent surveys once its location is known. For that reason, MacKenzie et al. (2006) suggest adopting a removal sampling protocol, which consists of surveying for a species only until it is first detected, up to a maximum of J surveys (Azuma et al. 1990, MacKenzie and Royle 2005). Since surveying stops once a species is first detected, the assumption of independence between surveys is less problematic. We refer to this approach of surveying during a primary sampling period only until a species is first detected as a removal RD protocol. With a removal RD protocol, single-site-visit estimates of occupancy can be computed using the single-season removal model of MacKenzie et al. (2006, p. 102). The likelihood function for this model is identical to that of MacKenzie et al. (2002) with missing data after the first detection.

Dynamic Occupancy Models

Transitions in occupancy between site visits can be estimated by fitting dynamic models to data collected using RD protocols.
These models include the probability of local colonization (γ) and extinction (ε) as parameters. In this context, γ is the probability that an unoccupied site at time t will become occupied at time t+1. Local extinction, ε, is the probability that an occupied site at time t will become unoccupied at time t+1. Data collected with a fixed-replicate RD protocol can be fit to the multi-season model of MacKenzie et al., and data collected with a removal RD protocol can be fit to a multi-season removal model. The likelihood function for a two-season dynamic removal model is:

$$
L(\psi, \gamma, \varepsilon, p_1, p_2) = \prod_{i=1}^{N} \Big[ \psi \, p_1^{y_{i1}} (1 - p_1)^{j_{i1} - y_{i1}} \big\{ (1 - \varepsilon) \, p_2^{y_{i2}} (1 - p_2)^{j_{i2} - y_{i2}} + \varepsilon (1 - y_{i2}) \big\} + (1 - \psi)(1 - y_{i1}) \big\{ \gamma \, p_2^{y_{i2}} (1 - p_2)^{j_{i2} - y_{i2}} + (1 - \gamma)(1 - y_{i2}) \big\} \Big]
$$

where ψ is the probability of site occupancy during time 1, p_1 and p_2 are the conditional probabilities of detecting a species, given presence, during times 1 and 2, respectively; j_i1 and j_i2 denote the number of surveys until the first detection of a species at site i during times 1 and 2, respectively (note that if a species remains undetected at site i during time t, then j_it = J, the maximum number of surveys); y_i1 and y_i2 are binary indicators of whether a species is detected (y = 1) or not (y = 0) at site i during times 1 and 2, respectively; and N is the number of sites surveyed. The framework for extending this dynamic removal model to three or more seasons follows the same structure, with the likelihood of the model modified to reflect a removal sampling protocol.

To test for closure between site visits we use a likelihood ratio comparison of open (γ, ε ≥ 0) and closed (γ = ε = 0) models. This test is a ratio of the likelihoods of two nested models, each evaluated at the maximum likelihood estimates (MLEs) of its parameters; the null model fixes γ = ε = 0.
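The two-season removal likelihood and the open-versus-closed comparison can be sketched in Python as follows. This is a minimal illustration written from the parameter definitions above, not the code used in the thesis. The function names `two_season_removal_loglik` and `lrt_statistic` and the single-site example data are assumptions. In practice the log-likelihoods being compared would be maximized over the free parameters of each model, which this sketch does not do.

```python
import numpy as np

def two_season_removal_loglik(psi, gamma, eps, p1, p2, j1, j2, y1, y2):
    """Log-likelihood of a two-season dynamic removal occupancy model.

    psi: occupancy probability at time 1; gamma, eps: local colonization
    and extinction probabilities; p1, p2: per-survey detection probabilities.
    j1, j2: surveys until first detection (J if never detected).
    y1, y2: 1 if detected during that site visit, else 0.
    """
    j1, j2, y1, y2 = (np.asarray(a) for a in (j1, j2, y1, y2))
    # Probability of the observed surveys given the site is occupied:
    # (1-p)^(j-1) * p if first detected on survey j, (1-p)^J if never detected.
    d1 = (1 - p1) ** (j1 - y1) * p1 ** y1
    d2 = (1 - p2) ** (j2 - y2) * p2 ** y2
    # Given the site is unoccupied, only an all-zero history is possible.
    u1 = 1 - y1
    u2 = 1 - y2
    occupied_t1 = psi * d1 * ((1 - eps) * d2 + eps * u2)
    unoccupied_t1 = (1 - psi) * u1 * (gamma * d2 + (1 - gamma) * u2)
    return np.sum(np.log(occupied_t1 + unoccupied_t1))

def lrt_statistic(loglik_open, loglik_closed):
    """Twice the difference in log-likelihood between open and closed models."""
    return 2.0 * (loglik_open - loglik_closed)

# Hypothetical single site with J = 2: detected on the first survey of
# visit 1 (j1 = 1, y1 = 1), never detected on visit 2 (j2 = 2, y2 = 0).
ll_closed = two_season_removal_loglik(0.5, 0.0, 0.0, 0.5, 0.5, [1], [2], [1], [0])
ll_open = two_season_removal_loglik(0.5, 0.0, 0.5, 0.5, 0.5, [1], [2], [1], [0])
```

For this history the closed model must explain the second-visit non-detection entirely as missed detection, whereas the open model can also attribute it to local extinction, so the open log-likelihood is larger at these parameter values.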
The likelihood ratio test statistic is calculated as twice the difference in the maximized log-likelihoods of the two nested models. Under the standard regularity conditions, the limiting distribution of the test statistic under the null hypothesis is χ² with degrees of freedom equal to the difference in dimensionality between the two models (Royle and Dorazio 2008). However, in our situation the null values of γ and ε lie on the boundary of the parameter space, so the limiting distribution is a mixture of χ² distributions and zeros (Self and Liang 1987). The mixing proportion of this distribution depends on the Fisher Information and is difficult to calculate; however, the distribution of the test statistic can be approximated by simulating hypothetical data sets under the null hypothesis (Appendix A).

Application of Robust Design Sampling to Breeding Bird Distributions

Application Methods

As an application, we fit standard occupancy and RD models to avian point count data. We used two separate datasets for this application that illustrate different RD sampling designs. Our first dataset was collected along approximately 600 km of the Madison and Missouri Rivers, Montana, during the 2004 breeding season. Each site was visited twice, once between 25 May and 15 June and again between 15 June and 10 July. A standard 10-minute, 50 m radius point count survey was conducted during each site visit. Each 10-minute survey was further divided into four 2.5-minute sampling intervals, which served as secondary sampling periods. This dataset was collected using a removal RD sampling protocol, so sampling stopped for a species during a site visit once it was detected. For a more detailed description of field methods, see Fletcher and Hutto (2008).

Our second dataset was collected at the Hubbard Brook Experimental Forest, New Hampshire, during the 2007 breeding season. Each of 184 sites was visited three times between 2 June and 2 July. A standard 10-minute, 50 m radius
point count was conducted during each site visit. Each 10-minute survey was further divided into three 3 min 20 s sampling intervals, which served as secondary sampling periods. This dataset was collected using a fixed-replicate RD sampling protocol, so each species was re-sampled during each sampling interval. For a more detailed description of field methods, see Betts et al. (2008).

We fit open and closed RD models to both datasets. We treated closed RD models as a restricted version of open RD models (γ = ε = 0). We fit standard occupancy models to a truncated version of each dataset. We generated truncated datasets by collapsing information from each primary sampling period into either a 1 if a species was detected or a 0 if a species remained undetected. This effectively treated each site visit as a single survey. Because apparent changes in site occupancy could potentially be explained by differences in detectability between site visits, we fit RD models estimating both constant detection probability and site-visit-specific detection probability. This resulted in fitting four RD models for each species: closed with constant p, closed with site-visit-specific p, open with constant p, and open with site-visit-specific p. For both datasets, we fit models to species that were detected on > 10% of sites surveyed (riparian = 28 species, Hubbard Brook = 18 species).

Because standard occupancy and RD models are fit to different data, formal comparison of open and closed models can only be performed with RD models. Thus, we tested for closure using likelihood ratio comparisons of nested open and closed RD models. In addition, we calculated the Bayesian information criterion (BIC) for each RD model as −2 log(L) + K log(N), where K denotes the number of estimable parameters (Burnham and Anderson 2002). We then computed a set of model weights for ranking nested and non-nested models fit to each species' data as follows:
$$
w_r = \frac{\exp(-\Delta_r / 2)}{\sum_{i=1}^{R} \exp(-\Delta_i / 2)}
$$

where Δ_i = BIC_i − BIC_min denotes the difference between the i-th model's BIC value and the smallest BIC value in the set of R candidate models.

Application Results

For 16 of the 28 species considered from the riparian dataset, open RD models received more than half of the model weight according to BIC weights (Fig. 1-1a). Likelihood ratio tests provided similar results, rejecting the null hypothesis of closure between primary sampling periods for 20 species (p < 0.05, Figure 1-1a). In general, apparent colonization or extinction events with the riparian dataset were best explained by transitions in occupancy rather than changes in detectability; i.e., seasonal changes in detection probability were insufficient for explaining apparent transitions. In only two clear cases (house finch [Carpodacus mexicanus] and European starling [Sturnus vulgaris]) did closed models with site-visit-specific detection probability best explain apparent changes in site occupancy (Figure 1-1a). In all cases, open RD models provided lower estimates of ψ than standard occupancy models (Figure 1-2a).

For 17 of 18 species considered from the Hubbard Brook dataset, open RD models received more than half of the model weight according to BIC weights (Fig. 1-1b). Likelihood ratio tests provided similar results, rejecting the null hypothesis of closure between site visits for all 18 species (p < 0.01, Figure 1-1b). Apparent colonization and extinction events with the Hubbard Brook dataset were always better explained by transitions in occupancy rather than changes in detection probability (Figure 1-1b). In all cases, open RD models provided lower estimates of ψ than standard occupancy models (Fig. 1-2b).
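The BIC weights used to rank the four RD models can be computed along the following lines. This is a minimal sketch assuming the standard information-criterion weight formula (Burnham and Anderson 2002). The function names and the log-likelihood values are hypothetical and do not come from the thesis.

```python
import numpy as np

def bic(loglik, k, n):
    """Bayesian information criterion: -2 log(L) + K log(N)."""
    return -2.0 * loglik + k * np.log(n)

def bic_weights(bic_values):
    """Convert BIC values for R candidate models into weights that sum to 1."""
    b = np.asarray(bic_values, dtype=float)
    delta = b - b.min()                # difference from the best (smallest) BIC
    rel = np.exp(-0.5 * delta)         # relative support for each model
    return rel / rel.sum()

# Hypothetical maximized log-likelihoods and parameter counts for one species:
# closed/constant p, closed/visit-specific p, open/constant p, open/visit-specific p.
logliks = [-250.0, -249.0, -240.0, -239.5]
n_params = [2, 3, 4, 5]
n_sites = 184

bics = [bic(ll, k, n_sites) for ll, k in zip(logliks, n_params)]
weights = bic_weights(bics)
open_weight = weights[2] + weights[3]  # combined weight of the two open models
```

Summing the weights of the two open models gives a single number comparable to the "more than half of the model weight" criterion used in the results above.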
Simulation Study

Simulation Methods

To assess how violations of the closure assumption can bias parameter estimates, we conducted simulation studies with sites open and closed to changes in occupancy between site visits. We simulated two levels of initial probability of site occupancy ψ and single-site-visit detection probability p (high = 0.7, low = 0.3), with sites open to varying levels of probability of local colonization (γ) or extinction (ε) between site visits (γ, ε = 0 to 0.95; γ = ε = 0 indicates that populations are closed).

To guide our simulations, we adopted the sampling protocols used by Fletcher and Hutto (2008; described above), who used a removal RD protocol, and Betts et al. (2008; also described above), who used a fixed-replicate RD protocol. We based one set of simulations on Fletcher and Hutto (2008) by simulating observations from two primary sampling periods, each of which was divided into four secondary sampling periods (J = 4 surveys). We based a second set of simulations on Betts et al. (2008) by simulating observations from three primary sampling periods, each of which was divided into three secondary sampling periods (J = 3 surveys). In both approaches, a primary sampling period corresponds to a single site visit.

For each replicate dataset, we simulated observations on N = 1000 sites. A site was initially occupied with probability ψ, and species were detected on occupied sites with probability $1 - (1 - p)^{1/J}$ during each survey, so that the probability of at least one detection during a site visit was p. During subsequent site visits, species were absent from sites occupied during the previous visit with probability ε and present on sites unoccupied during the previous visit with probability γ. We randomly generated 1000 replicate datasets for each combination of ψ, p, γ, and ε.

Using the data-generating process described above, we simulated three different sampling protocols to address the closure assumption. For a fixed-replicate RD protocol, each primary
sampling period consisted of J secondary sampling periods, regardless of detection history. For a removal RD protocol, we only surveyed during a primary sampling period until a species was first detected, for a maximum of J surveys. For a standard occupancy protocol, we truncated information from primary sampling periods into a 0 for non-detection and 1 for detection, effectively treating each primary sampling period as a single survey.

For both RD protocols, we fit open and closed models. Fixed-replicate RD models had J × s surveys, where s is the number of primary sampling periods. Removal RD models had a maximum of J × s surveys, but surveys stopped after the initial detection in any primary sampling period. Open RD models estimated transitions in occupancy between primary sampling periods and closed RD models assumed closure between primary sampling periods. For the standard occupancy sampling protocol, we fit single-season occupancy models (MacKenzie et al. 2002) to the truncated data, with the number of surveys equal to the number of primary sampling periods. This resulted in fitting five models to each replicate dataset.

Simulation Results

Estimates of ψ from closed models are sensitive to changes in occupancy between primary sampling periods (Fig. 1-3), whereas open models provided unbiased estimates of ψ. Bias is most pronounced with the standard occupancy sampling protocol, and is less pronounced with closed RD models. Interestingly, closed fixed-replicate RD models perform almost as well as open RD models when sites are open to extinction (Fig. 1-3a, b). In most cases, changes in site occupancy between primary sampling periods lead to overestimates of ψ for closed models (Fig. 1-3). The exception is for intermediate values of extinction with a closed fixed-replicate RD model. Estimates
of ψ are most sensitive to local colonization events, while estimates are relatively robust for low probabilities of local extinction between primary sampling periods. Interestingly, closed models are most sensitive to extinction events between primary sampling periods when occupancy rates are high (Fig. 1-3a), because more sites are occupied and can therefore go locally extinct. Conversely, closed models are most sensitive to colonization events between primary sampling periods when occupancy rates are low (Fig. 1-3d), because more sites are unoccupied and can therefore be colonized. These overestimates of ψ are driven by underestimates of p (Appendix B).

Importantly, the power of a likelihood ratio test to detect violations of the closure assumption increases with increasing extinction and colonization rates between site visits (Fig. 1-4; see Appendix C for a description of power calculations). Power to detect violation of the closure assumption is slightly greater for a fixed-replicate RD sampling protocol than for a removal RD sampling protocol. Also, the power to detect extinction events between site visits is greatest when occupancy is high, whereas the power to detect colonization events between site visits is greatest when occupancy is low.

Discussion

The assumption that sites are closed to changes in occupancy over a season is a major assumption of standard occupancy models. Recently, MacKenzie et al. (2006) have suggested the assumption of closure can be relaxed if changes in site occupancy are random (by drawing analogy from Kendall (1999)) when occupancy is estimated. This relaxation may be appropriate in some situations, especially if the species of interest is wide-ranging relative to survey sites and movement into and out of the sampling area is effectively random. Additionally, conducting repeated surveys close in time, either on the same day or during
subsequent days, has been advocated as a means to maximize the likelihood of closure (MacKenzie et al. 2006). Occupancy models share many similarities with traditional mark-recapture models, such that investigators have drawn analogy from mark-recapture models to infer potential biases resulting from violation of the closure assumption in occupancy estimation. For example, MacKenzie et al. (2006) used results from Kendall (1999) to suggest that if only two survey occasions are used and movement into or out of the study area between surveys is non-random (i.e., immigration-only or emigration-only movement), estimates of occupancy should reflect the probability of site occupancy during either the first or second survey occasion (for emigration or immigration, respectively). However, our simulations demonstrate that violation of the closure assumption consistently leads to overestimates of the probability of site occupancy during either survey occasion, or for both occasions combined.

Biases in estimates of the probability of site occupancy for standard occupancy models are driven largely by the detectability component of these models. If a species is detected at least once at a site during a primary sampling period, all other non-detections during that same period are treated as failures to detect a species that was present. If a site is open to changes in occupancy during that period, some of those non-detections may instead reflect true absence during part of the period. Thus, incorrectly assuming closure deflates estimates of detection probability and inflates estimates of occupancy. Our application of RD sampling makes the assumption of closure less problematic because closure is assumed over a very short time period.

While methods of maximizing the likelihood of closure have been proposed, the importance of this issue has been unclear because no systematic approach has examined how
sensitive occupancy models are to violations of this assumption. Understanding the level of sensitivity in parameter estimates to violations of model assumptions is important because it allows for determining the amount of effort necessary to properly safeguard against violations of these assumptions. Our simulations have demonstrated that estimates of occupancy are sensitive even to small violations of the closure assumption. It is important to note, however, that our work only addresses bias in estimates of the probability of site occupancy, not in estimated habitat relationships. Future work is still needed to address the strength and direction of bias in estimates of habitat relationships arising from violations of the closure assumption.

Our application of RD models to point count data provides results consistent with the simulation study. Models based on RD sampling yielded higher estimates of detection probabilities and lower estimates of occupancy relative to standard occupancy models, which is expected if sites are open to changes in occupancy between site visits. Our application also suggests that habitats may frequently be open to changes in site occupancy over time scales typical of many wildlife occupancy investigations. The average time between site visits for the riparian dataset was 3.5 weeks, and the average time between site visits for the Hubbard Brook dataset was 6 8 days. Consistently high support for open RD models indicates support for the hypothesis that many birds are shifting territories and/or migratory birds are arriving late or leaving early even over these relatively short time spans.

Within-season shifts in territories may occur for several reasons. For example, Betts et al. (2008) demonstrated apparent within-breeding-season movement of Black-throated Blue Warblers (Dendroica caerulescens) along a habitat gradient.
These warblers presumably shifted territories as more reliable information about habitat quality became available. Studies using radio telemetry and/or color banding have demonstrated that individuals may shift territories
after failed breeding attempts (Walk et al. 2004, Fletcher et al. 2006) or in response to seasonal fluctuations in food availability (Klemp 2003). Additionally, Hamer et al. (2008) have demonstrated seasonal shifts in habitat use by amphibians exploiting ephemeral wetlands. In these situations, occupancy models that assume closure between surveys widely separated in time are likely to produce biased estimates of occupancy and detection probability.

The sensitivity of model performance to changes in site occupancy and the likelihood that sites are open to changes in site occupancy during timescales typical of many occupancy studies underscore the importance of addressing the closure assumption whenever possible. Ideally, this assumption should be addressed in the study design phase, by planning for surveys to be conducted as close in time as possible. This could include repeat surveys during the same day or on consecutive days, as recommended by MacKenzie et al. (2006).

If conducting repeated surveys close in time, both fixed-replicate and removal RD approaches offer advantages and disadvantages. A removal approach increases the likelihood of independence among surveys by stopping surveys during a primary sampling period once a species is detected, but the power to detect a violation of closure is less than with fixed-replicate models. Closed fixed-replicate approaches produce less biased parameter estimates when the closure assumption is violated, but assuring independence between surveys is more difficult. One approach to this problem is to model detections at time t as a function of detections during time t − 1 (within the same primary sampling period; see Betts et al. (2008)).
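The relationship between the fixed-replicate RD, removal RD, and standard occupancy protocols can be illustrated with a minimal sketch. It assumes a site's detection history is stored as one list of 0/1 values per primary sampling period; the function names and the use of `None` for unsurveyed occasions are illustrative choices, not part of the original analysis.

```python
from typing import List, Optional

# One inner list of 0/1 detections per primary sampling period
# (a fixed-replicate RD history). This layout is an assumption of the sketch.
History = List[List[int]]


def to_removal(history: History) -> List[List[Optional[int]]]:
    """Mimic a removal RD protocol: within each primary period, occasions
    after the first detection are not surveyed (recorded here as None)."""
    out = []
    for period in history:
        row: List[Optional[int]] = []
        detected = False
        for y in period:
            if detected:
                row.append(None)  # survey stopped after first detection
            else:
                row.append(y)
                detected = (y == 1)
        out.append(row)
    return out


def to_standard(history: History) -> List[int]:
    """Mimic the standard occupancy protocol: collapse each primary period
    to 1 if the species was detected at least once, otherwise 0."""
    return [1 if any(period) else 0 for period in history]


# Hypothetical site with two primary periods of four secondary occasions each.
site = [[0, 1, 0, 1], [0, 0, 0, 0]]
removal_history = to_removal(site)
standard_history = to_standard(site)
```

For the hypothetical site above, the removal history keeps the first detection in the first period and drops the later occasions, while the standard history reduces each primary period to a single survey outcome.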
If repeated surveys are separated over longer timescales, our approach provides a method to test for closure by dividing each site visit into two or more secondary sampling periods. For many investigations, implementing our recommendations should require only minor modifications of existing protocols. For example, timed occupancy surveys such as avian point
counts or amphibian call surveys can simply divide individual surveys into several shorter sampling intervals. Using an RD design, these sampling intervals can be treated as secondary sampling periods and each site visit can be treated as a primary sampling period, which will allow testing for closure by comparing open and closed models.

One problem with our approach of treating individual site visits as a primary sampling period is that temporary site absences can be confounded with extinction or colonization events. For example, if a territory were to overlap a survey site, but not be completely contained within that site, what might be an apparent extinction or colonization event could simply be an animal still present in its territory, but absent from the survey site. This is likely to have occurred with the datasets we considered, especially for wide-ranging species and those that occur at low densities (which may correspond to greater territory size). Teasing apart true colonization and extinction events from temporary absence from a site is not possible with these datasets, and our estimates likely represent an upper limit to changes in site occupancy.

This issue should also be considered in the sampling design. One approach for addressing this issue is to increase the area sampled. If temporary immigration and emigration are driving apparent immigration and emigration, expanding the area sampled should provide more support for closed models. This is less of a problem for the riparian dataset because most forest patches sampled were small, and increasing the point count radius could have resulted in sampling non-forest habitat. Therefore, we expect the likelihood of temporary immigration or emigration is probably less than for more contiguous habitats. Hubbard Brook is characterized by relatively contiguous habitat, which makes the issue of temporary site absences more relevant for this dataset.
We further analyzed the Hubbard Brook data at a 100 m point count radius to determine if inferences on violations of closure were
sensitive to sampling area. Tests for closure at this radius did not change our inference for any species (p < 0.01), although BIC weight of closed models increased substantially for the yellow-bellied sapsucker and decreased substantially for white-breasted nuthatch. This suggests that, at least for the yellow-bellied sapsucker, a wide-ranging species, temporary site absences may be occurring at least some of the time.

If temporary immigration or emigration were driving apparent colonization and extinction events, we should further expect to consistently reject the closure hypothesis for large, wide-ranging species. However, many species for which the closure hypothesis was supported are wide-ranging species, such as the black-headed grosbeak (Pheucticus melanocephalus), downy woodpecker (Picoides pubescens), and black-billed magpie (Pica hudsonia). Further, species body mass (a surrogate of territory size; see Bowman for justification and Appendix D for body mass values used) was not a significant predictor of BIC model weight for either the riparian or Hubbard Brook data (Rota, unpublished results). Thus, despite the potential to confound temporary site absences with true immigration or emigration, our analyses strongly suggest that sites were open to changes in occupancy over the duration of these two studies.

Adequately addressing the closure assumption is critical to drawing valid inference for both ecological questions and conservation problems. Our simulations demonstrate that standard occupancy models are highly sensitive to violations of the closure assumption and will always overestimate the probability of site occupancy when closure is violated. Further, this bias is dependent on the amount of immigration and emigration. These parameters are unlikely to remain constant within or among seasons, making inference even on relative occupancy from closed models problematic.
Because of the direction of these biases, an uncritical application of closed occupancy models could have dire consequences when monitoring rare or declining
species for conservation and management decisions. We recommend explicitly addressing the assumption of closure whenever possible, whether it is conducting repeat sampling over a timescale (hours to days) that will maximize the likelihood of closure, or adopting a design which will allow testing for closure. Adequately addressing assumptions is an essential part of any modeling process, and an improved focus on the closure assumption will play a crucial role for managers and scientists alike in providing meaningful estimates of occupancy.
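The comparison of open and closed models in this chapter rests on BIC model weights and likelihood ratio tests. The sketch below shows the standard formulas for both; it assumes the number of sites is used as the sample size in BIC, and all log-likelihood values in the usage lines are hypothetical numbers, not results from this study.

```python
import math
from typing import List


def bic(log_lik: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion; using the number of sites as the
    sample size is an assumption of this sketch."""
    return -2.0 * log_lik + n_params * math.log(n_sites)


def bic_weights(bics: List[float]) -> List[float]:
    """Relative model weights: exp(-0.5 * delta BIC), normalised to sum to 1."""
    best = min(bics)
    rel = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(rel)
    return [r / total for r in rel]


def lr_statistic(log_lik_closed: float, log_lik_open: float) -> float:
    """Likelihood ratio statistic for the closed model nested in the open model."""
    return 2.0 * (log_lik_open - log_lik_closed)


# Hypothetical log-likelihoods for one species (illustrative values only).
closed_bic = bic(log_lik=-105.0, n_params=2, n_sites=165)
open_bic = bic(log_lik=-100.0, n_params=4, n_sites=165)
weights = bic_weights([closed_bic, open_bic])
```

A weight near 1 for the open model corresponds to the strong support for open RD models described above; the reference distribution used to convert the likelihood ratio statistic into a p-value is left out because it depends on how the open model is parameterised.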
Figure 1-1. Relative weights of open and closed occupancy models for breeding birds in two areas. A) Bayesian information criterion (BIC) model weight for all species detected on >10% of riparian point counts (N = 165), Montana, using a removal RD sampling protocol. B) BIC model weight for all species detected on >10% of Hubbard Brook point counts (N = 184), New Hampshire, using a fixed-replicate sampling protocol. Significant likelihood ratio tests for closure are marked with asterisks (*: p < 0.05, **: p < 0.01). For a list of scientific names, see Appendix D.
Figure 1-2. [...] standard occupancy models for all species detected on >10% of riparian dataset point counts. Both models assume constant detection probability. Open circles indicate species for which the closure hypothesis was supported. B) [...] fixed-replicate RD models relative to standard occupancy models for all species detected on >10% of Hubbard Brook dataset point counts. Both models assume constant detection probability. The dashed line indicates a 1:1 relationship, or no [...] standard occupancy models.
Figure 1-3. Mean estimated $\psi$ from standard occupancy (MacKenzie et al. 2002), fixed-replicate RD, and removal RD models, each calculated from 1000 replicate simulations. A) Extinction [...] Extinction only, [...] All data were simulated using two primary sampling periods, with four secondary sampling periods nested within each primary period. For all simulations, p = 0.7.
Figure 1-4. The power of a likelihood ratio test to detect changes in site occupancy between primary sampling periods. Calculations were made assuming two primary sampling periods, each with four secondary sampling periods. Note that these power calculations were made assuming surveys on 165 sites (i.e., the sampling protocol from Fletcher and Hutto (2008)).
CHAPTER 2
DOES ADDRESSING IMPERFECT DETECTION IMPROVE PREDICTIVE PERFORMANCE OF SPECIES DISTRIBUTION MODELS?

Introduction

Accurate predictions of species distributions are essential for conservation (e.g., Carroll and Johnson 2008), management (Fernandez et al. 2006), and forecasting the effects of global climate change (Lassalle et al. 2008). Species distribution models (SDMs) are a commonly used tool to describe the geographic distribution of plants and animals by quantifying species-environment relationships (Guisan and Thuiller 2005). Recent advances in statistical techniques and geographic information system (GIS) technology have caused a proliferation in the number of SDM approaches (Guisan and Zimmermann 2000, Guisan and Thuiller 2005). While applications exist for predicting both the abundance (e.g., Thogmartin et al. 2004) and occurrence of species across space, often only information on occurrence is available for developing models.

Logistic regression is a commonly used approach to modeling species occurrence (Guisan and Zimmermann 2000). This approach relies on binomial response data in the form of presences or absences (or, more frequently, detection/non-detection) (Manly et al. 2002). Obtaining reliable presence/absence data is often costly and difficult. Additionally, records of occurrence of many species are sparse or known only through museum records. This has prompted researchers to develop a variety of techniques for modeling presence-only data. A recent comprehensive analysis of presence-only SDMs demonstrates that many of these techniques have high predictive performance (Elith et al. 2006).

A common problem with the above approaches is their inability to properly account for the imperfect detection of species. For example, SDM approaches that rely on presence/absence data, such as logistic regression, require the strong assumption that species are perfectly detected, an assumption that is unlikely to hold in most situations (MacKenzie et al. 2002). Further, recent
investigations have demonstrated that failure to account for imperfect detection can lead to biased estimates of habitat relationships for logistic regression models (Tyre et al. 2003, Gu and Swihart 2004, Martin et al. 2005). Presence-only methods can potentially circumvent the detectability issue by only using data from locations where a species was known to occur. As a result, presence-only methods are unable to directly estimate the probability of occurrence (Ward et al. 2008), instead estimating a relative measure of suitability (Phillips et al. 2006).

Recently developed occupancy modeling approaches that account for imperfect detection (occupancy modeling approaches hereafter) have the potential to overcome this shortfall by simultaneously estimating the probability of occurrence and the probability a species is detected (MacKenzie et al. 2002, MacKenzie et al. 2006, Royle and Dorazio 2008, Rota et al. In Review). These approaches rely on detection/non-detection data, and require multiple surveys at a site to estimate the probability of detection (MacKenzie et al. 2002). Occupancy models can accommodate covariates associated with detection probability and the probability of occurrence and can be estimated with either a classical likelihood-based (MacKenzie et al. 2006) or a Bayesian (Royle and Dorazio 2008) approach. Occupancy modeling approaches have gained enormously in popularity (Marsh and Trenham 2008) and have the potential to improve the predictive performance of SDMs. However, to our knowledge, they have yet to be assessed relative to other approaches in a species distribution modeling context where the goal is to maximize the predictive performance of models across geographic regions.

Here, we use occupancy modeling approaches to create SDMs for breeding forest birds that are imperfectly detected. We contrast an occupancy modeling approach with other commonly used SDM approaches: logistic regression and maximum entropy (Phillips et al. 2006). By explicitly accounting for imperfect detection when
building SDMs, we expect that occupancy models should produce more accurate estimates of habitat relationships and improve predictive performance relative to approaches that do not account for detection bias (e.g., logistic regression) or use a subset of the data to do so (e.g., presence-only approaches).

Methods

Field Methods

Our analysis draws on a spatially and temporally extensive effort to estimate geographical distributions and monitor population trends of landbirds breeding in western Montana and northern Idaho. Since 1994, surveys coordinated through the Northern Region Landbird Monitoring Program (NRLMP) have monitored all diurnal landbirds that can be detected using a single point count methodology (Hutto and Young 1999). The entire NRLMP dataset consists of 482 permanently marked transects stratified across nine National Forests and three cooperating agencies (Fig. 2-1). Each transect consists of 10 permanently marked points (sites hereafter), spaced approximately 300 m apart. Standard 10-minute, 100-meter-radius point count surveys were conducted at each permanently marked site. During each survey, all birds seen or heard were recorded by trained observers. A subset of transects was surveyed every year since 1994, with the exception of 1997, 1999, 2001, and 2006. Each site was only surveyed once per breeding season (Hutto and Young 2002). During all years, transects were surveyed from late May to mid-July. In many years, each 10-minute survey was divided into two 5-minute sampling intervals. When point counts were divided into shorter sampling intervals, observers noted the sampling interval in which a species was first detected. By doing so, observers were effectively using a removal sampling design (sensu MacKenzie and Royle 2005, Rota et al. In Review), meaning they were only sampling for a species until it was first detected. We used this information to estimate and account for detectability in SDMs.

Transect and Species Selection

We initially refined our analysis to transects and sites with reliable GIS data. Occupancy modeling approaches require multiple surveys at each site in order to estimate detection probability. Thus, in order to build SDMs that incorporate imperfect detection, we further refined our analysis to those sites where surveys were subdivided into shorter sampling intervals. Finally, we removed from the analysis surveys conducted during 2003 and 2005, because of restricted geographic sampling. This resulted in 8037 individual point counts, located on 268 transects (Fig. 2-1), spanning 5 different years (1994, 1995, 2004, 2007, and 2008).

We then split the refined data into two groups: training and validation data. We randomly selected 2/3 of the transects (5487 point counts located on 181 transects) for model training, i.e., we developed SDMs using these data only. The remaining 1/3 of the data (2550 point counts located on 87 transects) was used to validate SDMs. With the fixed effects estimated from the training data, we made predictions to the validation data (see below).

For this analysis, we chose species with high and low relative abundance and detection probability. To ensure enough data for developing SDMs, we limited our pool of candidate species to those detected on at least 5% of point counts, for a total of 39 species. We initially partitioned candidate species into two groups, relatively abundant and relatively uncommon, based on the proportion of point counts on which they were detected. To estimate detection probability, we fit intercept-only occupancy models (including random effects; see below) to each species.
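The transect-level split into training and validation data described above can be sketched as follows. This is a minimal illustration, assuming transects are identified by hashable IDs; the function name, the seed, and the synthetic IDs are hypothetical, and the sketch will not reproduce the particular random draw used in the analysis.

```python
import random
from typing import Hashable, Iterable, Set, Tuple


def split_transects(
    transect_ids: Iterable[Hashable], n_train: int, seed: int = 0
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Randomly assign whole transects (not individual point counts) to
    training or validation, so all sites on a transect stay together."""
    unique_ids = sorted(set(transect_ids))  # sort for reproducibility
    if not 0 <= n_train <= len(unique_ids):
        raise ValueError("n_train must be between 0 and the number of transects")
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


# Hypothetical transect labels; 268 transects with 181 used for training,
# matching the counts reported in the text.
ids = [f"T{i:03d}" for i in range(268)]
train, validation = split_transects(ids, n_train=181, seed=1)
```

Splitting by transect rather than by point count keeps spatially clustered sites out of both groups at once, which is consistent with the transect counts reported for each group.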
From the relatively abundant group, we then selected species with the highest and lowest estimated detection probability, excluding Pine Siskin (Carduelis pinus) because of their nomadic nature. This resulted in selecting Townsend's Warbler (Dendroica townsendii), which
is relatively abundant in the study area and highly detectable, and Golden-crowned Kinglet (Regulus satrapa), which is relatively abundant in the study area but with a relatively low detection probability. From the relatively uncommon group, we selected the Olive-sided Flycatcher (Contopus cooperi), which was the species in that group with the highest estimated detection probability. We additionally selected the Black-headed Grosbeak (Pheucticus melanocephalus). We selected this species because of a relatively low estimated detection probability (approximately 40% for a 10-minute survey) and because of a close correspondence with the estimated detection probability from an independent dataset (Rota et al. In Review).

Habitat and Detection Covariates

We modeled species distributions as a function of several habitat covariates. We selected unique habitat covariates for each species, based on published accounts of habitat preference. Townsend's Warblers are associated with [...]nal forest (Wright et al. 1998). We thus selected habitat covariates to reflect canopy cover and tree diameter at breast height (DBH). Olive-sided Flycatchers are associated with open-canopy forest, and frequently breed along streams (Altman and Sallabanks 2000), so we selected habitat covariates to reflect canopy cover and distance to stream. Golden-crowned Kinglets are strongly associated with subalpine forest, show a preference for breeding close to streams, and may show a positive relationship with tree size (Ingold and Galati 1997). We thus selected habitat covariates to reflect the presence of subalpine forest, distance to streams, and tree DBH. Based on known associations of Black-headed Grosbeaks with deciduous vegetation, streams, and forest gaps and openings (Hill 1995), we selected habitat variables reflecting the presence of deciduous forest, proximity to streams, and canopy cover.
In addition to these unique habitat variables for each species, we modeled the probability of occurrence as a function of elevation and survey date for all species. Elevation is an indirect gradient that is likely to strongly influence the probability of
occurrence (Austin 2002). Survey date is likely to influence the probability of occurrence, especially early in the season as species are making settlement decisions. We explored both linear and quadratic relationships with elevation and survey date.

We derived all habitat covariates from Geographic Information System (GIS) based vegetation measures. All original GIS layers for canopy cover, DBH, and forest type were 15 m resolution digital land cover maps developed by the United States Forest Service Northern Region Vegetation Mapping Program (USFS R1-VMP) based on Landsat TM imagery and aerial photo interpretation (Brewer et al. 2004). We derived habitat variables from three R1-VMP GIS layers: Tree Diameter, Canopy Cover, and Life Form. The original R1-VMP Tree Diameter layer described tree size with four size categories, and we used a Principal Component Analysis (PCA) to reduce the number of DBH variables to two. The original R1-VMP Canopy Cover layer described canopy cover using three categories. We again used PCA to reduce the number of canopy cover variables to two. In both cases, one principal component reflected a linear gradient of canopy cover or DBH, whereas the other component reflected a non-linear gradient (high factor loadings on intermediate categories). The original R1-VMP Life Form layer describes the relative canopy cover of several vegetative communities in each cell. From this layer, we derived new layers describing the presence/absence of subalpine and deciduous forest in each cell and in the surrounding 1 km landscape. We derived stream distances from the United States Geological Survey (USGS) National Hydrography Dataset, and we derived elevation from a 30 m resolution Digital Elevation Model. We aggregated all GIS layers to a common 200-meter resolution prior to analysis to reflect the grain of our sampling unit (100 m radius counts).
In addition to habitat covariates, we selected several variables that may influence the probability that a species is detected. For all species, we modeled detection probability as a
function of date of survey, time of survey, wind speed, and sky cover/precipitation. We also included quadratic effects of date of survey and time of survey. We initially attempted to include an observer random effect, since accounting for observer bias has been demonstrated to improve model fit with another large-scale avian monitoring dataset (Sauer et al. 1994). However, observations were sparse for the Olive-sided Flycatcher and Black-headed Grosbeak, leading to difficulties with model convergence. We thus excluded a random observer effect from analysis.

Logistic Regression Models

We fit logistic regression models to detection/non-detection data for the four species considered. We used a repeated-measures logistic regression design (Liang and Zeger 1986) because many sites had multiple surveys across years. For example, some sites were surveyed in 1994, 1995, 2004, 2007, and 2008. We estimated regression parameters using Generalized Estimating Equations (GEE), as implemented in PROC GENMOD (SAS Institute 2008). For each species, we modeled the probability of occurrence as a function of the habitat variables described above.

For each species, we used manual backwards selection based on QIC (Pan 2001) and 95% confidence intervals of parameters to arrive at a parsimonious model (cf. Thogmartin et al. 2004). We first fit full models with all habitat variables included. We then removed the habitat variable with the smallest estimated regression coefficient whose 95% confidence interval overlapped zero. If removal of this variable resulted in a decreased QIC score, it remained out of the model and we removed the variable with the next smallest estimated regression coefficient whose 95% confidence interval overlapped zero. We continued in this manner until either removal of a parameter no longer resulted in a decreased QIC, or until no estimated regression coefficient had a 95% confidence interval that overlapped zero.
We then used this final model to make predictions and compare across models.
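The backwards selection procedure above can be sketched in a model-agnostic way. The sketch assumes a user-supplied `fit` function that returns a QIC score and, for each variable, an estimate with its 95% confidence limits; `fit`, `toy_fit`, and all numbers are hypothetical stand-ins rather than the PROC GENMOD workflow used in the analysis, and "smallest coefficient" is interpreted here as smallest in absolute value.

```python
from typing import Callable, Dict, List, Tuple

# fit(variables) -> (QIC, {variable: (estimate, ci_low, ci_high)})
FitResult = Tuple[float, Dict[str, Tuple[float, float, float]]]


def backward_select(
    variables: List[str], fit: Callable[[List[str]], FitResult]
) -> List[str]:
    """Drop, one at a time, the variable with the smallest |estimate| whose
    95% CI overlaps zero, as long as each removal lowers QIC."""
    current = list(variables)
    qic, coefs = fit(current)
    while current:
        overlapping = [v for v in current if coefs[v][1] <= 0.0 <= coefs[v][2]]
        if not overlapping:
            break  # every remaining CI excludes zero
        drop = min(overlapping, key=lambda v: abs(coefs[v][0]))
        reduced = [v for v in current if v != drop]
        new_qic, new_coefs = fit(reduced)
        if new_qic >= qic:
            break  # removal no longer decreases QIC
        current, qic, coefs = reduced, new_qic, new_coefs
    return current


def toy_fit(variables: List[str]) -> FitResult:
    """Toy stand-in for a GEE fit with made-up estimates and QIC values."""
    table = {
        "canopy": (0.80, 0.30, 1.30),
        "dbh": (0.05, -0.20, 0.30),
        "stream": (-0.10, -0.50, 0.30),
    }
    qic = 100.0 + 2.0 * len(variables) - (10.0 if "canopy" in variables else 0.0)
    return qic, {v: table[v] for v in variables}


selected = backward_select(["canopy", "dbh", "stream"], toy_fit)
```

In the toy example, the two variables whose intervals overlap zero are removed in turn because each removal lowers the made-up QIC score, leaving only the variable whose interval excludes zero.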
Hierarchical Occupancy Models

We contrasted logistic regression models with hierarchical occupancy models to build SDMs for all species considered. We fit hierarchical models using Bayesian methods, which we implemented in WinBUGS 1.4 (Spiegelhalter et al. 2003). Hierarchical occupancy models contain explicit models for both the latent occupancy state, which is the ecological process of interest, and the observations conditional on the latent occupancy state (Royle and Dorazio 2008). The process model describes the latent occupancy state and is specified as:

$$z_i \sim \mathrm{Bernoulli}(\psi_i),$$

where $z_i$ is the latent occupancy state of site $i$ (a site is occupied if $z_i = 1$ and unoccupied if $z_i = 0$) and $\psi_i$ is the probability site $i$ is occupied. We modeled $\psi_i$ as a function of both fixed and random effects:

$$\mathrm{logit}(\psi_i) = \beta_0 + \boldsymbol{\beta}_{cov} \mathbf{x}_i + \eta_t + \gamma_r,$$

where $\beta_0$ is an intercept term, $\boldsymbol{\beta}_{cov}$ is a vector of regression parameters associated with the covariates, $\mathbf{x}_i$ is a vector of habitat covariates at site $i$, $\eta_t$ is a random effect of transect $t$, and $\gamma_r$ is a random effect of year $r$. We included a random transect effect because sites were clustered by transect. This can be an effective way to deal with potential spatial autocorrelation (Keitt et al. 2002). In our case, if spatial autocorrelation was pervasive in the data, we expected it to occur within transects rather than among transects, because on average transects are >8 km apart. Additionally, we include a random year effect because many sites were surveyed in multiple years. [...] each site by year and limits potential pseudoreplication from temporally repeated measures.

The observation model describes variation in detections, conditional on the latent occupancy state. We used a removal sampling design (sensu MacKenzie and Royle 2005, Rota et al. In Review), where we only surveyed for a species until it was first detected, for
a maximum of two surveys. Such designs are typical of many point-count-based surveys (Ralph et al. 1993). The likelihood function for detection probability with a removal sampling protocol, conditional on the latent occupancy state, is:

$$L(p_i \mid y_i, j_i, z_i = 1) = p_i^{y_i} (1 - p_i)^{j_i - y_i},$$

where $p_i$ is the probability of detecting a species at site $i$, $y_i$ is a binary indicator of whether a species was detected at site $i$ ($y_i = 1$) or not ($y_i = 0$), and $j_i$ is the survey during which a species was detected at site $i$ (note that if no detections were made, $j_i = J = 2$, the maximum number of surveys at a site). We further modeled $p_i$ as a linear function of covariates likely to influence detection probability as:

$$\mathrm{logit}(p_i) = \alpha_0 + \boldsymbol{\alpha}_{cov} \mathbf{v}_i,$$

where $\alpha_0$ is an intercept term, $\boldsymbol{\alpha}_{cov}$ is a vector of regression parameters associated with the covariates, and $\mathbf{v}_i$ is a vector of detection covariates at site $i$.

For all fixed effects, we specified vague normal prior distributions with mean 0 and variance 1000. We specified all random effects to have a mean of zero and variance $\sigma^2_t$ (for the random transect effect) or $\sigma^2_r$ (for the random year effect). We specified vague inverse gamma prior distributions for these variance hyperparameters, with mean 1 and variance 1000. We estimated posterior distributions from two Markov chains, each with 100,000 iterations, discarding the first 50,000 as burn-in and drawing every fifth iteration thereafter.

For hierarchical occupancy models, we used the model selection strategy described by Kuo and Mallick (1998) to arrive at a parsimonious model as an alternative to DIC, which is limited for hierarchical models with two or more levels (Millar 2009). This approach estimates the posterior probability of all possible combinations of fixed effects in a model. This involves fitting an additional inclusion parameter, $w_k$, for each fixed effect $k$. Each inclusion parameter is assumed to be a Bernoulli random variable with parameter $\pi_k$, which is the probability that fixed effect $k$ will be included in the model. We specified an impartial prior distribution on all inclusion parameters as: $w_k \sim \mathrm{Bernoulli}(0.5)$. We then estimated the posterior probability of each model directly from the Markov Chain Monte Carlo (MCMC) output by calculating the proportion of times each configuration of inclusion parameters, $w_k$, equaled 1. We selected the model with the largest posterior probability as our parsimonious model.

Maximum Entropy Models

Finally, we used a maximum entropy machine learning approach to build SDMs for all species. Maximum entropy approaches rely on presence-only data to model the distributions of species, avoiding the problem of imperfect detection by not considering sites during the modeling process where species were not detected. This approach has recently been shown to have high predictive performance relative to other presence-only approaches (Elith et al. 2006). Maximum entropy methods approximate a probability distribution that satisfies known constraints on species distributions, but is closest to maximum entropy (i.e., the approximated probability distribution is as close to uniform as possible). Environmental variables at known occurrence locations form the constraints, such that the expected value of an environmental variable under the estimated distribution must match its empirical average (Phillips et al. 2006).

We used Maxent (Phillips et al. 2006) to model species distributions from presence-only data. To estimate species distributions with Maxent, we combined a set of GIS layers unique to each species that described several biotic and abiotic gradients likely to influence the probability of occurrence (as described above), which we then overlaid with locations of known species
occurrence. We then used two approaches to arrive at a mean suitability across years. First, we segregated the presence-only points by year, building separate models for each year. We then estimated mean suitability by simply averaging the estimate of suitability across years. Second, we fit a single model to all presence points, regardless of the year they were collected. These two approaches provided almost identical estimates of relative suitability, so we only report the average estimate of suitability across years.

Maxent uses a jackknife test to estimate the relative importance of variables in a model. This test is performed by running a series of models, with each run omitting a single habitat variable in turn. Additionally, a series of models is run in which each habitat variable is used in isolation. The estimate of relative importance is then obtained by comparing model gain when a variable is omitted and when only that variable is included. We use the results of the jackknife test to qualitatively compare habitat relationships estimated by Maxent to habitat relationships estimated by logistic and occupancy models. Maxent fits highly non-linear relationships to the data, and direct comparison of estimated habitat effects between Maxent and logistic and occupancy models is not straightforward.

Comparing Predictive Performance

Once we arrived at a parsimonious model for each SDM approach, we compared their predictive performance by calculating the Area Under the Curve (AUC) of a Receiver Operating Characteristic (ROC) plot (Fielding and Bell 1997). SDM predictions are in the form of a probability of detection (for logistic and occupancy models) or relative suitability (for maximum entropy models). The ROC plot is a threshold-independent measure of model predictive performance, because it evaluates predicted versus observed detections at all possible thresholds.
The curve is constructed by plotting the fraction of true positive predictions on the y-axis against the fraction of false positive predictions on the x-axis for every threshold value. The AUC is the area under this curve: an AUC score of 0.50 indicates that predictions are no better than random, while an AUC score >0.50 indicates successively greater predictive performance. We used the ROCR package (Sing et al. 2005) in the R statistical environment (R Development Core Team 2008) to calculate AUC. This measure is useful for contrasting different model types that vary subtly in interpretation (Elith et al. 2006), such as those contrasted here.

We used estimates of fixed effects from the training data to make predictions to the validation data. Since presence/absence is only imperfectly assessed in our validation data, our predictions were interpreted as the probability of detection/non-detection at new sites. We estimated the probability of detecting a species with logistic regression models by multiplying the estimated fixed habitat effects by the observed habitat variables at each new location and back-transforming from the logit scale. However, because logistic regression approaches do not explicitly account for imperfect detection, the probability of occurrence is confounded with detection probability. Therefore, to ensure we were using the same currency to compare models, we additionally estimated the probability of detection at each new location with occupancy modeling approaches. We first estimated the probability of occurrence by multiplying the estimated fixed habitat effects by the observed habitat variables at each new location. We then estimated the probability of detecting a species at each site by multiplying the estimates of fixed detectability effects by the observed detectability variables at each new location. Finally, we multiplied the predicted probability of occurrence by the predicted probability of detecting a species at least once to arrive at a prediction of detection or non-detection at a site. For Maxent, we compared the index of relative suitability with observed detections/non-detections at the validation sites.
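The prediction and scoring steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the ROCR-based analysis used in the study: the function names (inv_logit, p_detect_at_least_once, auc), the coefficient values, and the five validation sites are all hypothetical. The sketch assumes detection probability p is constant across the replicate surveys at a site, so that the probability of at least one detection in n surveys is 1 - (1 - p)^n, and it computes AUC as the proportion of (detection, non-detection) site pairs in which the detection site receives the higher prediction, with ties counted as one half, which is equivalent to the area under the ROC curve.

```python
import math

def inv_logit(x):
    """Back-transform a logit-scale linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def p_detect_at_least_once(psi, p, n_surveys):
    """Occurrence probability times the probability of >= 1 detection in n_surveys.

    Assumes (for illustration) that p is the same in every replicate survey.
    """
    return psi * (1.0 - (1.0 - p) ** n_surveys)

def auc(scores, observed):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, observed) if y == 1]
    neg = [s for s, y in zip(scores, observed) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one detection and one non-detection")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical validation sites: one standardized habitat covariate per site.
elevation = [-1.2, -0.4, 0.1, 0.8, 1.5]
detected = [0, 0, 1, 0, 1]

# Hypothetical logit-scale coefficients (not taken from Table 2-1).
psi = [inv_logit(-0.5 + 0.9 * x) for x in elevation]
pred = [p_detect_at_least_once(s, p=0.6, n_surveys=2) for s in psi]

print("AUC from predicted detection:", auc(pred, detected))
print("AUC from predicted occurrence:", auc(psi, detected))
```

Because p is held constant across sites in this sketch, multiplying by the detection term rescales every prediction by the same factor, so the ranking of sites and therefore the AUC is unchanged.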
Results

Each modeling approach estimated slightly different habitat relationships. Generally, the three approaches agreed on which habitat variables had the strongest influence on the probability of occurrence (Table 2-1 and Figure 2-2). For example, an elevation term entered all of the logistic and occupancy models except the final Olive-sided Flycatcher occupancy model, and was a consistently influential variable in Maxent models. Additionally, for several habitat variables, there was consistent agreement between models. For example, [...] were correlated with canopy cover, Golden-crowned Kinglets were correlated with distance to stream and presence of subalpine forest, and Black-headed Grosbeaks were correlated with canopy cover for all three modeling approaches.

There was, however, considerable disagreement over the importance of some estimated habitat relationships, particularly for the Olive-sided Flycatcher (Table 2-1 and Figure 2-2). For example, Maxent estimated strong correlations between Olive-sided Flycatcher occurrence and [...] model for the Olive-sided Flycatcher. Similarly, both logistic and occupancy models predicted strong negative correlations between the presence of deciduous trees and Black-headed Grosbeak occurrence, while this was the least influential variable in Maxent models. This disagreement in estimated habitat relationships led to considerably different predictive maps between the Maxent approach and the logistic and occupancy modeling approaches (Fig. 2-3).

Occupancy models and logistic regression models tended to have similar fixed effects coefficients (Table 2-1). All estimated habitat relationships except one were in the same direction and of similar magnitude. Most of the same variables entered both logistic and occupancy models, although occupancy models always had the same number of variables or fewer. Estimates of habitat relationships were not consistently larger or smaller for either the logistic or occupancy modeling approaches.

Detection probability decreased as a function of site-specific variables for the [...]crowned Kinglet (Table 2-2). Detection probability was best modeled as constant for the Olive-sided Flycatcher and the Black-headed Grosbeak, the two least abundant species considered. The intercept of estimated detection probability (detection probability when all other covariates equal 0) during a 10-minute point count ranged from 0.81 for the Black[...]

Species distribution models based on occupancy modeling and logistic regression consistently outperformed maximum entropy models in terms of AUC. However, there was no clear distinction in AUC between occupancy modeling and logistic regression models. The AUC score for the Olive-sided Flycatcher was higher for logistic regression models than for occupancy models; for the other species considered, AUC scores were nearly identical for logistic and occupancy models (Table 2-3).

Discussion

Species distribution modeling approaches that included detection/non-detection data consistently outperformed species distribution models that relied on presence-only data. This result is consistent with earlier work (Brotons et al. 2004) that contrasted other presence-only methods with logistic regression. While a recent overview of presence-only SDM approaches suggests that Maxent performs well relative to other presence-only methods (Elith et al. 2006), the consistent underperformance of Maxent relative to logistic and occupancy models suggests that approaches that rely on detection/non-detection data should be used whenever the appropriate data are available.
A possible reason for the consistent underperformance of Maxent models relative to logistic and occupancy models is their reliance on presence-only points. While presence-only data provide information on where a species is likely to be found, they may not be reliable for determining habitat variables a species is likely to avoid. For example, in the training data, Black-headed Grosbeaks were detected in only 5 of the 208 sites that had deciduous forest in the surrounding 1-km landscape. This led to a strong negative correlation between occurrence of the Black-headed Grosbeak and the presence of deciduous forest in the surrounding 1-km landscape for both logistic and occupancy models (admittedly, the opposite of the relationship we expected). However, the effect of this variable was negligible in the Maxent models, presumably because of the paucity of detections associated with that habitat. This could lead to erroneous predictions at new sites if, for example, a variable that is negatively correlated with the probability of occurrence co-occurs with other variables that correlate positively with occurrence probability.

Surprisingly, we found no clear distinction between the predictive performance of logistic regression and occupancy modeling approaches. Our a priori expectation was that, especially for species with low relative detection probability, a species distribution modeling approach that explicitly incorporated imperfect detection would improve predictive performance. Several recent simulation studies demonstrate that false absences, i.e., failing to detect a species that is in fact present, will result in biased estimates of habitat relationships when using logistic regression approaches (Tyre et al. 2003, Gu and Swihart 2004, Royle and Dorazio 2008).
These simulations demonstrate that when false absences are included in logistic regression models of habitat relationships, both the intercept and slope of the estimated relationship will be underestimated. However, inspection of the regression coefficients for the logistic regression models shows no consistent difference in estimation of either the intercept or slope of estimated habitat relationships.

Additionally, another simulation study (Rota, unpublished data) demonstrated that in some situations, occupancy models that account for imperfect detection will improve the predictive performance of species distribution models relative to logistic regression, in terms of AUC. In these simulations, occupancy models outperformed logistic regression models when detection probability was heterogeneous across sites. However, when detection probability was constant across sites, the predictive performance of logistic models was equivalent to that of occupancy models, despite underestimates of habitat relationships from logistic models. The predictive performance of logistic models is comparable to that of occupancy models when detection probability is constant across sites, even if it is less than 1, because the relative estimates of habitat relationships are maintained.

For both the Black-headed Grosbeak and the Olive-sided Flycatcher, the final model held detection probability constant across sites. This may partly explain the lack of gains in predictive performance of occupancy models for these species. While an occupancy modeling approach may have provided more accurate estimates of habitat relationships for these species, there still may have been no gain in predictive performance, since the final model held detection probability constant across sites. Indeed, evaluation of regression coefficients for both occupancy and logistic models shows relative estimates of habitat relationships to be similar.

Another reason occupancy modeling approaches may have failed to improve predictive performance is the short temporal scale of the replicate surveys. Ten-minute point counts were divided into two five-minute sampling intervals, and these sampling intervals were used as replicate surveys to estimate detection probability.
Little information relevant to estimating detection probability may have been gained in such a short period. For example, estimated detection probability is the probability of detecting an individual, conditional on the individual being present during the sampling occasion. However, both Black-headed Grosbeaks and Olive-sided Flycatchers have relatively large territories, and temporary absences from occupied or otherwise suitable habitat may be confounded with true absences over such a small sampling window. Thus, estimated detection probability may be biased high, leading to similar performance between occupancy modeling and logistic regression approaches. Nonetheless, for two species, parsimonious models did capture signals that date, time, and wind speed influenced detection probability.

An important assumption of occupancy modeling approaches is that sites are closed to changes in occupancy between replicate surveys (MacKenzie et al. 2002, Rota et al. In Review). Occupancy models can be very sensitive to violations of closure (Rota et al. In Review), and the temporal scale of our replicate surveys minimized the length of time over which closure was assumed. However, there may be a trade-off between conducting surveys over a short enough time period to minimize the likelihood of violating closure and conducting surveys over a long enough time period to collect adequate data on detection probability. Assessing the potential effect of such a trade-off could be an important avenue of future research in occupancy modeling approaches.

An additional assumption of occupancy modeling approaches is that there is no unmodeled heterogeneity in detection probability (MacKenzie et al. 2006). While our long-term dataset was uniquely designed to provide some information on detection probability (cf. Thogmartin et al. 2004), this assumption was almost certainly violated with our approach, as several likely sources of heterogeneity in detection probability were not modeled. For example, since including a random observer effect led to difficulty with model convergence, we excluded it from analysis. Additionally, background noise (e.g., stream noise) and temperature at a survey site are likely to influence detection probability. We excluded these variables from analysis, however, because these data were not consistently collected at all sites. Failure to model all sources of heterogeneity in detection probability may have contributed to the lack of improved predictive performance in occupancy models. However, it is not known how sensitive occupancy models are to violations of this assumption, or how violations of this assumption may bias estimates of detection probability and thus estimated habitat relationships. This is an important direction for future research into the performance of occupancy models.

While occupancy modeling approaches hold promise for improving estimates of habitat relationships and models of species distributions, in some situations they may offer little difference in inference regarding behavioral and ecological questions (Fletcher 2009) or predictive performance of species distributions (Table 2-3). We are not implying that models that do not account for imperfect detection are better than those that do; rather, other issues of sampling design may be of greater importance for inference in some situations (see, e.g., Bart et al. 2004). Instead, we contend that the onus is on ecologists to address the detection issue, as well as other problems in sampling design, in rigorous ways to evaluate potential biases in inference. This application highlights, however, that practitioners may need to balance the additional conceptual and computational expense of an occupancy modeling approach against potential gains in improved predictions of species distributions.
Table 2-1. Regression coefficients (with SE and SD for logistic and occupancy models, respectively), on the logit scale, from logistic and occupancy species distribution models, for variables that were selected for the final model. An empty cell indicates that variable was not considered, and a [...] indicates this variable was removed during the model selection procedure.

Columns: Species/Model, Intercept, Ccpc1*, Ccpc2*, Dbhpc1, Dbhpc2, Stream, Decid1k, Subalp, Elev, Elev^2, Date, Date^2

[...]
  Logistic: 0.44 (0.06), 0.23 (0.08), 0.81 (0.11), 0.56 (0.09), 0.15 (0.09), 0.27 (0.04), 0.43 (0.04), 0.42 (0.04), 0.47 (0.03)
  Occupancy: 0.41 (0.22), 1.12 (0.15), 0.42 (0.12), 0.36 (0.11), 0.47 (0.12), 0.58 (0.08), 0.40 (0.10), 0.35 (0.09)
Golden-crowned Kinglet
  Logistic: 1.24 (0.07), 0.48 (0.09), 0.20 (0.04), 0.74 (0.08), 0.20 (0.04), 0.26 (0.04), 0.15 (0.04), 0.13 (0.04)
  Occupancy: 1.14 (0.31), 0.54 (0.11), 0.20 (0.06), 0.55 (0.11), 0.13 (0.08), 0.31 (0.06), 0.03 (0.07), 0.15 (0.05)
Black-headed Grosbeak
  Logistic: 2.83 (0.12), 1.39 (0.17), 0.84 (0.46), 0.79 (0.10), 0.19 (0.07), 0.14 (0.08), 0.21 (0.07)
  Occupancy: 3.53 (0.33), 1.56 (0.24), 1.26 (0.60), 0.81 (0.15)
Olive-sided Flycatcher
  Logistic: 1.50 (0.08), 1.68 (0.25), 0.24 (0.06), 0.42 (0.07), 0.31 (0.05), 0.43 (0.06), 0.34 (0.06)
  Occupancy: 2.43 (0.23), 2.85 (0.59), 0.49 (0.12), 0.30 (0.08)

* Principal components of canopy cover for the surrounding 1-km landscape were used for the Olive-sided Flycatcher.
Table 2-2. Regression coefficients (and SD), on the logit scale, for estimates of detection probability from occupancy models. All variables were considered for all species. A [...] indicates that variable was removed during the model selection process.

Columns: Species, Intercept, Date, Date^2, Time, Time^2, Wind, Sky

[...] Warbler: 1.84 (0.12), 0.53 (0.14), 0.44 (0.13), 0.10 (0.06), 0.22 (0.05)
Golden-crowned Kinglet: 0.42 (0.11), 0.22 (0.11), 0.18 (0.08)
Black-headed Grosbeak: 0.26 (0.26)
Olive-sided Flycatcher: 1.01 (0.15)

Table 2-3. Area under the curve (AUC) scores for each species distribution modeling (SDM) approach.

Species                   Logistic   Occupancy   Maxent
[...]                     0.75       0.75        0.68
Golden-crowned Kinglet    0.63       0.63        0.62
Black-headed Grosbeak     0.71       0.71        0.62
Olive-sided Flycatcher    0.70       0.66        0.56
Figure 2-1. Location of permanently marked Northern Region Landbird Monitoring Program transects used to estimate the distribution of forest birds. The black dots represent transects selected for building species distribution models (SDMs), and the open rectangles represent transects selected for validating SDMs.
Figure 2-2. Results of jackknife test of relative variable importance from Maxent species distribution models. The dark blue bars indicate the gain from building species distribution models with only that variable. The light blue bars indicate the gain from building species distribution models without that variable. The red bar indicates the gain from building a species distribution model with all variables. High gain when a single variable is considered indicates a strong correlation with species occurrence. Low gain when a variable is excluded indicates that variable has unique information not included in other variables.
Figure 2-3. Predicted habitat suitability of Olive-sided Flycatcher (Contopus cooperi) in western Montana and northern Idaho. A) The probability of occurrence estimated from logistic regression models. B) The probability of occurrence estimated from occupancy models. C) Estimates of relative suitability from Maxent, a presence-only species distribution modeling approach. Darker gray indicates greater suitability. Note the different interpretation between the occupancy and logistic regression predictions (probabilistic) and the presence-only (relative suitability) models.
APPENDIX A
SIMULATING THE DISTRIBUTION OF THE LIKELIHOOD RATIO TEST STATISTIC

We simulated the distribution of the likelihood ratio (LR) test statistic by randomly generating data under the null hypothesis of closure, fitting open and closed models to the simulated data, and conducting a likelihood ratio comparison. Random data were generated using the maximum likelihood estimates (MLEs) of parameters from closed models. We repeated this process an arbitrarily large number of times (N = 10,000) for each LR test. We estimated p-values from the simulated distribution by determining the proportion of simulated test statistics that exceeded the observed LR test statistic.

[...] likelihoods of open and closed models, which we then used to calculate LR test statistics. Occasionally, this technique returned negative values of LR test statistics when the test statistic should have been zero. We determined that this occurred because optim cannot evaluate a parameter at zero when searching for MLEs of open models. The Nelder-Mead and BFGS algorithms contained within optim both search over a parameter space between [...] use an inverse-logit transformation to constrain the parameter space between 0 and 1. In some [...] logit^-1 [...]. When this happens, optim returns an incorrect MLE, which can result in negative LR test statistics. Thus, we determined that when using numeric optimization methods such as optim, negative values of an LR test statistic can always be treated as zeros.
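The simulation procedure described in this appendix can be sketched in Python with a deliberately simple stand-in model. This is an illustration of the general parametric bootstrap logic only, not the open and closed occupancy models fit in the study: here the null model assumes one success probability shared by two periods, the alternative allows a separate probability in each period, and both have closed-form MLEs, so no numeric optimizer is involved. The function names (log_lik, lr_statistic, simulated_p_value), the sample sizes, and the default of 2,000 simulations (the study used N = 10,000) are assumptions made for this example. Negative LR statistics are set to zero, following the reasoning above.

```python
import math
import random

def xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def log_lik(k, n, p):
    """Binomial log-likelihood of k successes in n trials (constant term dropped)."""
    return xlogy(k, p) + xlogy(n - k, 1.0 - p)

def lr_statistic(k1, k2, n):
    """LR statistic for H0: shared probability vs. H1: one probability per period."""
    p0 = (k1 + k2) / (2 * n)                      # MLE under the null
    ll0 = log_lik(k1, n, p0) + log_lik(k2, n, p0)
    ll1 = log_lik(k1, n, k1 / n) + log_lik(k2, n, k2 / n)
    # Treat negative values (numerical error only) as zero.
    return max(2.0 * (ll1 - ll0), 0.0)

def simulated_p_value(k1, k2, n, n_sim=2000, seed=1):
    """Proportion of LR statistics simulated under the null that exceed the observed one."""
    rng = random.Random(seed)
    observed = lr_statistic(k1, k2, n)
    p0 = (k1 + k2) / (2 * n)                      # simulate from the null MLE
    exceed = 0
    for _ in range(n_sim):
        s1 = sum(rng.random() < p0 for _ in range(n))
        s2 = sum(rng.random() < p0 for _ in range(n))
        if lr_statistic(s1, s2, n) > observed:
            exceed += 1
    return exceed / n_sim

# Hypothetical data: successes out of 50 trials in each of two periods.
print(simulated_p_value(25, 25, 50))   # periods identical: little evidence against H0
print(simulated_p_value(5, 45, 50))    # periods very different: strong evidence against H0
```

The same structure would apply to any pair of nested models: only the data generator and the two fitting steps inside lr_statistic would change.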
APPENDIX B
BIASES IN DETECTION PROBABILITY, P, RESULTING FROM VIOLATION OF THE CLOSURE ASSUMPTION

Changes in site occupancy between primary sampling periods lead to underestimates of p for closed models (Fig. B-1, B-2). Bias in estimates of p is more pronounced for standard occupancy protocols than for closed RD protocols. Estimates of p from closed models are most sensitive to extinction events when occupancy rates are high (Fig. B-1), and to colonization events when occupancy rates are low (Fig. B-2).

Figure B-1. Mean estimated probability of detection, p, [...] occupancy (MacKenzie et al. 2002), fixed replicate RD, and removal RD models, each calculated from 1000 replicate simulations. A) Extinction only, p = 0.70. B) Extinction only, p = 0.30. C) Colonization only, p = 0.70. D) Colonization only, p = 0.30. All data were simulated using two primary sampling periods, with four secondary periods nested within each primary period.

Figure B-2. Mean estimated probability of detection, p, [...] occupancy (MacKenzie et al. 2002), fixed replicate RD, and removal RD models, each calculated from 1000 replicate simulations. A) Extinction only, p = 0.70. B) Extinction only, p = 0.30. C) Colonization only, p = 0.70. D) Colonization only, p = 0.30. All data were simulated using two primary sampling periods, with four secondary periods nested within each primary period.
APPENDIX C
STATISTICAL POWER OF LIKELIHOOD RATIO TEST

The likelihood ratio test of closure measures the relative support for closed models (H_0) compared to open models (H_1). In this context, closed models are a restricted version of open models:

H_0: [...] (i.e., no extinction or colonization)
H_1: not H_0 (i.e., extinction or colonization > 0).

The likelihood ratio test statistic, G^2, is calculated as:

G^2 = -2 ln(likelihood ratio),

where the likelihood ratio is that of two nested models calculated using the maximum likelihood estimates (MLEs) of parameters under H_1 and H_0. Under the assumptions of H_1, the limiting distribution of G^2 is non-central chi-square [...] non-centrality [...] H_1 and H_0. The power of rejecting H_0 when H_1 is true is the probability of drawing a random value from the limiting distribution under H_1 that exceeds the critical value of an alpha-level test:

[...]

where [...] is the central chi-square critical value of the alpha-level test. Calculation of the non-centrality [...] computed by fitting open and closed models to data expected under the assumptions of H_1.

Calculating Expected Data

To calculate expected data, we first determine the number of sites that fall within each of 2^T mutually exclusive patterns of detection and non-detection, where T is the number of primary sampling periods. For example, two-season fixed replicate and removal RD protocols imply 2^2 = 4 mutually exclusive events (Table C-1). The expected number of sites where each mutually exclusive event is observed is calculated by multiplying the number of sites, N, by the probability of observing each mutually exclusive event (Table C-1).

With a fixed replicate RD protocol, we next determine the expected number of detections at sites where detections occurred. Let the random variables y_1, y_2, ..., y_T denote the number of detections observed during primary sampling period t (t = 1, 2, ..., T):

[...]

where J is the number of secondary sampling periods within each primary sampling period. For sites where a species is detected during primary sampling period t, we can show that:

[...]

and for sites where a species is not detected during primary sampling period t:

[...]

With a removal RD protocol, we determine the expected number of surveys at sites where detections occurred. Let the random variables y_1, y_2, ..., y_T denote a sequence of binary indicators of whether a species is detected (y = 1) or not (y = 0):

[...]

and let j_1, j_2, ..., j_T denote a corresponding sequence of sampling intervals in which a species is detected:

[...]

For sites where a species is detected during primary sampling period t, we can show that:

[...]

and for sites where a species is not detected during primary sampling period t:

[...]

Calculating Noncentrality

The final step in computing the statistical power is to fit open and closed models to the expected data and to calculate the likelihood ratio (LR) test statistic using these fits. The value of this LR test statistic is equal to the non-centrality parameter. For a fixed replicate RD protocol, the likelihood function under H_1 is:

[...]

where:

[...]

and

n_11 = the expected number of sites with at least one detection at primary sampling periods t = 1 and t = 2,
n_10 = the expected number of sites with at least one detection at primary sampling period t = 1 and no detections at primary sampling period t = 2,
n_01 = the expected number of sites with no detections at primary sampling period t = 1 and at least one detection at primary sampling period t = 2,
n_00 = the expected number of sites with no detections at primary sampling periods t = 1 and t = 2.

For a removal RD protocol, the likelihood function under H_1 is:

[...]

where:

[...]

and where n_11, n_10, n_01, and n_00 are defined exactly as they were defined for the fixed replicate RD protocol. For both fixed [...] function under H_0 [...]
Table C-1. The probability of observing each of four mutually exclusive patterns of detection and non-detection for a two-season, fixed replicate or removal RD sampling protocol. Note that the probabilities of all four events sum to 1.

                           Time 2
Time 1            Detected       Not Detected
Detected          [...]          [...]
Not Detected      [...]          [...]
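The power calculation outlined in Appendix C can be sketched in Python using only the standard library. This is a hedged illustration, not the code used in the study: the function names are hypothetical, and the default of 2 degrees of freedom is an assumption made here (on the reasoning that two parameters, extinction and colonization, are fixed under H_0); the appropriate value should be taken from the models actually being compared. The sketch writes the non-central chi-square upper tail as a Poisson-weighted mixture of central chi-square upper tails, and uses the closed-form central chi-square tail that exists for even degrees of freedom. The non-centrality value passed in stands for the LR statistic obtained by fitting open and closed models to the expected data.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a central chi-square with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def chi2_critical_value(alpha, df):
    """Critical value of an alpha-level central chi-square test, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even_df(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def lr_test_power(noncentrality, df=2, alpha=0.05):
    """P(non-central chi-square(df, noncentrality) > alpha-level critical value)."""
    crit = chi2_critical_value(alpha, df)
    half_nc = noncentrality / 2.0
    weight = math.exp(-half_nc)      # Poisson(half_nc) probability of j = 0
    cumulative = 0.0
    power = 0.0
    j = 0
    # Sum Poisson weights times central chi-square tails with df + 2j.
    while cumulative < 1.0 - 1e-12 and j < 10000:
        power += weight * chi2_sf_even_df(crit, df + 2 * j)
        cumulative += weight
        j += 1
        weight *= half_nc / j
    return power

# Hypothetical non-centrality values (LR statistics from expected data).
for nc in (0.0, 2.0, 5.0, 10.0):
    print(nc, lr_test_power(nc))
```

With a non-centrality of zero the function returns the nominal alpha, which is a convenient check that the mixture and the critical value agree. Very large non-centrality values would underflow the first Poisson weight in this simple form.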
APPENDIX D
SCIENTIFIC NAMES AND BODY MASS

Table D-1. Scientific names and body mass.

Common Name                     Scientific Name               Mass (g)^1   Dataset^2
Mourning Dove                   Zenaida macroura              119.00       R
Yellow-bellied Sapsucker        Sphyrapicus varius            50.30        HB
Downy Woodpecker                Picoides pubescens            27.00        R
Northern Flicker                Colaptes auratus              142.00       R
Western Wood-Pewee              Contopus sordidulus           12.80        R
Yellow-bellied Flycatcher       Empidonax flaviventris        11.60        HB
Willow Flycatcher               Empidonax traillii            13.40        R
Least Flycatcher                Empidonax minimus             10.30        R
Western Kingbird                Tyrannus verticalis           39.60        R
Eastern Kingbird                Tyrannus tyrannus             43.60        R
Blue-headed Vireo               Vireo solitarius              16.60        HB
Warbling Vireo                  Vireo gilvus                  14.80        R
Red-eyed Vireo                  Vireo olivaceus               16.70        HB
Black-billed Magpie             Pica hudsonia                 177.50       R
Tree Swallow                    Tachycineta bicolor           20.10        R
Black-capped Chickadee          Poecile atricapillus          10.80        R/HB
White-breasted Nuthatch         Sitta carolinensis            21.10        HB
Brown Creeper                   Certhia americana             8.40         HB
House Wren                      Troglodytes aedon             10.90        R
Winter Wren                     Troglodytes troglodytes       8.90         HB
Golden-crowned Kinglet          Regulus satrapa               6.20         HB
Swainson's Thrush               Catharus ustulatus            30.80        HB
Hermit Thrush                   Catharus guttatus             31.00        HB
American Robin                  Turdus migratorius            77.30        R
Gray Catbird                    Dumetella carolinensis        36.90        R
European Starling               Sturnus vulgaris              82.30        R
Cedar Waxwing                   Bombycilla cedrorum           31.85        R
Yellow Warbler                  Dendroica petechia            9.50         R
Magnolia Warbler                Dendroica magnolia            8.70         HB
Black-throated Blue Warbler     Dendroica caerulescens        10.15        HB
Yellow-rumped Warbler           Dendroica coronata            12.55        HB
Black-throated Green Warbler    Dendroica virens              8.80         HB
Blackburnian Warbler            Dendroica fusca               9.75         HB
Ovenbird                        Seiurus aurocapilla           21.00        HB
Common Yellowthroat             Geothlypis trichas            10.10        R
Yellow-breasted Chat            Icteria virens                25.30        R
Spotted Towhee                  Pipilo maculatus              40.50        R
Song Sparrow                    Melospiza melodia             20.75        R
Dark-eyed Junco                 Junco hyemalis                19.60        HB
Black-headed Grosbeak           Pheucticus melanocephalus     42.00        R
Red-winged Blackbird            Agelaius phoeniceus           52.55        R
Brown-headed Cowbird            Molothrus ater                43.90        R
Bullock's Oriole                Icterus bullockii             33.60        R
House Finch                     Carpodacus mexicanus          21.40        R
American Goldfinch              Carduelis tristis             12.90        R

1 Body masses were obtained from Dunning (1993).
2 R = riparian dataset, HB = Hubbard Brook dataset.
LIST OF REFERENCES

Alldredge, M. W., K. H. Pollock, T. R. Simons, J. A. Collazo, and S. A. Shriner. 2007. Time-of-detection method for estimating abundance from point-count surveys. Auk 124:653-664.
Altman, B., and R. Sallabanks. 2000. Olive-sided Flycatcher (Contopus cooperi). in A. Poole, editor. The Birds of North America Online. Cornell Lab of Ornithology, Ithaca; Retrieved from the Birds of North America Online: http://bna.birds.cornell.edu/bna/species/502
Austin, M. P. 2002. Spatial prediction of species distribution: An interface between ecological theory and statistical modelling. Ecological Modelling 157:101-118.
Azuma, D. L., J. A. Baldwin, and B. R. Noon. 1990. Estimating the occupancy of spotted owl habitat areas by sampling and adjusting for bias. General Technical Report PSW-124, U.S. Dept. of Agriculture, Forest Service, Pacific Southwest Research Station, Berkeley.
Bailey, L. L., T. R. Simons, and K. H. Pollock. 2004. Estimating site occupancy and species detection probability parameters for terrestrial salamanders. Ecological Applications 14:692-702.
Ball, L. C., P. F. Doherty, and M. W. McDonald. 2005. An occupancy modeling approach to evaluating a Palm Springs ground squirrel habitat model. Journal of Wildlife Management 69:894-904.
Bart, J., K. P. Burnham, E. H. Dunn, C. M. Francis, and C. J. Ralph. 2004. Goals and strategies for estimating trends in landbird abundance. Journal of Wildlife Management 68:611-626.
Betts, M. G., N. L. Rodenhouse, T. S. Sillett, P. J. Doran, and R. T. Holmes. 2008. Dynamic occupancy models reveal within-breeding season movement up a habitat quality gradient by a migratory songbird. Ecography 31:592-600.
Bowman, J. 2003. Is dispersal distance of birds proportional to territory size? Canadian Journal of Zoology 81:195-202.
Brewer, C. K., D. Berglund, J. A. Barber, and R. Bush. 2004. Northern Region Vegetation Mapping Project Summary Report and Spatial Datasets. in U. S. F. Service, editor.
Brotons, L., W. Thuiller, M. B. Araujo, and A. H. Hirzel. 2004. Presence-absence versus presence-only modelling methods for predicting bird habitat suitability. Ecography 27:437-448.
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: A practical information-theoretic approach. 2nd Edition. Springer-Verlag, New York.
Carroll, C., and D. S. Johnson. 2008. The importance of being spatial (and reserved): Assessing northern spotted owl habitat relationships with hierarchical Bayesian models. Conservation Biology 22:1026-1036.
Clark, J. S. 2005. Why environmental scientists are becoming Bayesians. Ecology Letters 8:2-14.
Dunning, J. B. 1993. CRC handbook of avian body masses. CRC Press, Boca Raton, FL.
Elith, J., C. H. Graham, R. P. Anderson, M. Dudik, S. Ferrier, A. Guisan, R. J. Hijmans, F. Huettmann, J. R. Leathwick, A. Lehmann, J. Li, L. G. Lohmann, B. A. Loiselle, G. Manion, C. Moritz, M. Nakamura, Y. Nakazawa, J. M. Overton, A. T. Peterson, S. J. Phillips, K. Richardson, R. Scachetti-Pereira, R. E. Schapire, J. Soberon, S. Williams, M. S. Wisz, and N. E. Zimmermann. 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29:129-151.
Farnsworth, G. L., K. H. Pollock, J. D. Nichols, T. R. Simons, J. E. Hines, and J. R. Sauer. 2002. A removal model for estimating detection probabilities from point-count surveys. Auk 119:414-425.
Fernandez, N., M. Delibes, and F. Palomares. 2006. Landscape evaluation in conservation: Molecular sampling and habitat modeling for the Iberian lynx. Ecological Applications 16:1037-1049.
Fielding, A. H., and J. F. Bell. 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:38-49.
Fletcher, R. J. 2009. Does attraction to conspecifics explain the patch-size effect? An experimental test. Oikos: In Press.
Fletcher, R. J., and R. L. Hutto. 2008. Partitioning the multi-scale effects of human activity on the occurrence of riparian forest birds. Landscape Ecology 23:727-739.
Fletcher, R. J., R. R. Koford, and D. A. Seaman. 2006. Critical demographic parameters for declining songbirds breeding in restored grasslands. Journal of Wildlife Management 70:145-157.
Geissler, P. H., and M. R. Fuller. 1987. Estimation of the Proportion of an Area Occupied by an Animal Species. Proceedings of the Section on Survey Research Methods of the American Statistical Association 1986:533-538.
Gu, W. D., and R. K. Swihart. 2004. Absent or undetected? Effects of non-detection of species occurrence on wildlife habitat models. Biological Conservation 116:195-203.
Guisan, A., A. Lehmann, S. Ferrier, M. Austin, J. M. C. Overton, R. Aspinall, and T. Hastie. 2006. Making better biogeographical predictions of species distributions. Journal of Applied Ecology 43:386-392.
Guisan, A., and W. Thuiller. 2005. Predicting species distribution: offering more than simple habitat models. Ecology Letters 8:993-1009.
Guisan, A., and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135:147-186.
Hamer, A. J., S. J. Lane, and M. J. Mahony. 2008. Movement patterns of adult Green and Golden Bell Frogs Litoria aurea and the implications for conservation management. Journal of Herpetology 42:397-407.
Hanski, I. 1994. A practical model of metapopulation dynamics. Journal of Animal Ecology 63:151-162.
Hill, G. E. 1995. Black-headed Grosbeak (Pheucticus melanocephalus). in A. Poole, editor. The Birds of North America Online. Cornell Lab of Ornithology, Ithaca; Retrieved from the Birds of North America Online: http://bna.birds.cornell.edu/bna/species/143
Hutto, R. L., and J. S. Young. 1999. Habitat Relationships of Landbirds in the Northern Region, USDA Forest Service. General Technical Report RMRS-GTR-32, USDA Forest Service, Ogden.
Hutto, R. L., and J. S. Young. 2002. Regional landbird monitoring: perspectives from the Northern Rocky Mountains. Wildlife Society Bulletin 30:738-750.
Ingold, J. L., and R. Galati. 1997. Golden-crowned Kinglet (Regulus satrapa). in A. Poole, editor. The Birds of North America Online. Cornell Lab of Ornithology, Ithaca; Retrieved from the Birds of North America Online: http://bna.birds.cornell.edu/bna/species/301
Keitt, T. H., O. N. Bjornstad, P. M. Dixon, and S. Citron-Pousty. 2002. Accounting for spatial pattern when modeling organism-environment interactions. Ecography 25:616-625.
Kendall, W. L. 1999. Robustness of closed capture-recapture methods to violations of the closure assumption. Ecology 80:2517-2525.
Klemp, S. 2003. Altitudinal dispersal within the breeding season in the Grey Wagtail Motacilla cinerea. Ibis 145:509-511.
Kroll, A. J., K. Risenhoover, T. McBride, E. Beach, B. J. Kernohan, J. Light, and J. Bach. 2008. Factors influencing stream occupancy and detection probability parameters of stream-associated amphibians in commercial forests of Oregon and Washington, USA. Forest Ecology and Management 255:3726-3735.
Kuo, L., and B. Mallick. 1998. Variable selection for regression models. Sankhya: The Indian Journal of Statistics 60B:65-81.
Lassalle, G., M. Beguer, L. Beaulaton, and E. Rochard. 2008. Diadromous fish conservation plans need to consider global warming issues: An approach using biogeographical models. Biological Conservation 141:1105-1118.
Liang, K. Y., and S. L. Zeger. 1986. Longitudinal Data Analysis Using Generalized Linear Models. Biometrika 73:13-22.
MacKenzie, D. I., J. D. Nichols, J. E. Hines, M. G. Knutson, and A. B. Franklin. 2003. Estimating site occupancy, colonization, and local extinction when a species is detected imperfectly. Ecology 84:2200-2207.
MacKenzie, D. I., J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle, and C. A. Langtimm. 2002. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83:2248-2255.
MacKenzie, D. I., J. D. Nichols, J. A. Royle, K. H. Pollock, L. L. Bailey, and J. E. Hines. 2006. Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Amsterdam.
MacKenzie, D. I., and J. A. Royle. 2005. Designing occupancy studies: general advice and allocating survey effort. Journal of Applied Ecology 42:1105-1114.
Manly, B. F. J., L. L. McDonald, D. L. Thomas, T. L. McDonald, and W. P. Erickson. 2002. Resource selection by animals: Statistical design and analysis for field studies. 2nd edition. Kluwer Academic Publishers, New York.
Marsh, D. M., and P. C. Trenham. 2008. Current trends in plant and animal population monitoring. Conservation Biology 22:647-655.
Martin, T. G., B. A. Wintle, J. R. Rhodes, P. M. Kuhnert, S. A. Field, S. J. Low-Choy, A. J. Tyre, and H. P. Possingham. 2005. Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecology Letters 8:1235-1246.
Millar, R. B. 2009. Comparison of hierarchical Bayesian models for overdispersed count data using DIC and Bayes' factors. Biometrics In Press. DOI: 10.1111/j.1541-0420.2008.01162.x.
Moilanen, A. 2002. Implications of empirical data quality to metapopulation model parameter estimation and application. Oikos 96:516-530.
Pan, W. 2001. Akaike's information criterion in generalized estimating equations. Biometrics 57:120-125.
Phillips, S. J., R. P. Anderson, and R. E. Schapire. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190:231-259.
Pollock, K. H. 1982. A capture-recapture design robust to unequal probability of capture. Journal of Wildlife Management 46:752-757.
R Development Core Team. 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
Ralph, C. J., G. R. Geupel, P. Pyle, T. E. Martin, and D. F. DeSante. 1993. Handbook of Field Methods for Monitoring Landbirds, General Technical Report PSW-GTR-144. USFS Pacific Southwest Research Station, Albany.
Rota, C. T., R. J. Fletcher, R. M. Dorazio, and M. G. Betts. In Review. Occupancy estimation and the closure assumption.
Royle, J. A., and R. M. Dorazio. 2008. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations, and communities. Elsevier, Amsterdam.
Royle, J. A., M. Kery, R. Gautier, and H. Schmid. 2007. Hierarchical spatial models of abundance and occurrence from imperfect survey data. Ecological Monographs 77:465-481.
SAS Institute. 2008. SAS software Version 9.2 of the SAS System for Windows. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc., Cary, NC, USA.
Sauer, J. R., B. G. Peterjohn, and W. A. Link. 1994. Observer Differences in the North American Breeding Bird Survey. Auk 111:50-62.
Self, S. G., and K.-Y. Liang. 1987. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82:605-610.
Sing, T., O. Sander, N. Beerenwinkel, and T. Lengauer. 2005. ROCR: visualizing classifier performance in R. Bioinformatics 21:3940-3941.
Spiegelhalter, D. J., A. Thomas, N. Best, and D. Lunn. 2003. WinBUGS User Manual, Version 1.4.
Stauffer, H. B., C. J. Ralph, and S. L. Miller. 2004. Ranking habitat for marbled murrelets: New conservation approach for species with uncertain detection. Ecological Applications 14:1374-1383.
Thogmartin, W. E., J. R. Sauer, and M. G. Knutson. 2004. A hierarchical spatial model of avian abundance with application to Cerulean Warblers. Ecological Applications 14:1766-1779.
Tipton, H. C., V. J. Dreitz, and P. F. Doherty. 2008. Occupancy of mountain plover and burrowing owl in Colorado. Journal of Wildlife Management 72:1001-1006.
Tyre, A. J., B. Tenhumberg, S. A. Field, D. Niejalke, K. Parris, and H. P. Possingham. 2003. Improving precision and reducing bias in biological surveys: Estimating false-negative error rates. Ecological Applications 13:1790-1801.
Walk, J. W., K. Wentworth, E. L. Kershner, E. K. Bollinger, and R. E. Warner. 2004. Renesting decisions and annual fecundity of female Dickcissels (Spiza americana) in Illinois. Auk 121:1250-1261.
Ward, G., T. Hastie, S. Barry, J. Elith, and J. R. Leathwick. 2008. Presence-only data and the EM algorithm. Biometrics.
Winchell, C. S., and P. F. Doherty. 2008. Using California gnatcatcher to test underlying models in habitat conservation plans. Journal of Wildlife Management 72:1322-1327.
Wright, A. L., G. D. Hayward, S. M. Matsuoka, and P. H. Hayward. 1998. Townsend's Warbler (Dendroica townsendi). in A. Poole, editor. The Birds of North America Online. Cornell Lab of Ornithology, Ithaca; Retrieved from the Birds of North America Online: http://bna.birds.cornell.edu/bna/species/333
BIOGRAPHICAL SKETCH
Christopher Rota grew up in a small town in upstate New York. He first went to school at environmental studies. A wandering spirit then brought him through 47 U.S. states, 7 European countries, and to a small cabin in West Yellowstone, Montana. Because of his love for the outdoors and everything wild, he returned to school at the University of Montana and completed wildlife biology. Graduate school first took him to the wildlife ecology and conservation. Next, he will go to the University of Missouri, to earn a Doctor of Philosophy degree in fisheries and wildlife science.