Citation
- Permanent Link:
- https://ufdc.ufl.edu/UF00097460/00001
Material Information
- Title:
- Population density estimation using line transect sampling /
- Creator:
- Ondrasik, John Anthony, 1951-
- Place of Publication:
- Gainesville, Fla.
- Publisher:
- University of Florida
- Publication Date:
- 1979
- Copyright Date:
- 1979
- Language:
- English
- Physical Description:
- viii, 92 leaves : ill. ; 28 cm.
Subjects
- Subjects / Keywords:
- Approximation ( jstor ); Density estimation ( jstor ); Estimation methods ( jstor ); Expected values ( jstor ); Maximum likelihood estimations ( jstor ); Population density ( jstor ); Population estimates ( jstor ); Random variables ( jstor ); Statistical discrepancies ( jstor ); Wildlife population estimation ( jstor ); Animal populations -- Mathematical models ( lcsh ); Dissertations, Academic -- Statistics -- UF; Plant populations -- Mathematical models ( lcsh ); Statistics thesis Ph. D
- Genre:
- bibliography ( marcgt )
non-fiction ( marcgt )
Notes
- Thesis:
- Thesis--University of Florida.
- Bibliography:
- Bibliography: leaves 90-91.
- Additional Physical Form:
- Also available on World Wide Web
- General Note:
- Typescript.
- General Note:
- Vita.
- Statement of Responsibility:
- by John A. Ondrasik.
Record Information
- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
- Resource Identifier:
- 023347944 ( AlephBibNum )
- 06591994 ( OCLC )
- AAL3076 ( NOTIS )
Full Text
POPULATION DENSITY ESTIMATION USING
LINE TRANSECT SAMPLING
BY
JOHN A. ONDRASIK
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1979
To Toni
For Her Love and Support
ACKNOWLEDGMENTS
I would like to thank my adviser, Dr. P. V. Rao, for his
guidance and assistance throughout the course of this research.
His patience and thoughtful advice during the writing of this
dissertation is sincerely appreciated. I would also like to
thank Dr. Dennis D. Wackerly for the help and encouragement
that he provided during my years at the University of Florida.
Special thanks go to my family for the moral support
they provided during the pursuit of this degree. I am espe-
cially grateful to my wife, Toni, whose love and understand-
ing made it possible for me to finish this project. Her
patience and sacrifices will never be forgotten.
Finally, I want to express my thanks to Mrs. Edna Larrick
for her excellent job of typing this manuscript despite the
time constraints involved.
iii
TABLE OF CONTENTS

                                                            Page
ACKNOWLEDGMENTS ..... iii
LIST OF TABLES ..... vi
ABSTRACT ..... vii

CHAPTER

I   INTRODUCTION ..... 1
    1.1 Literature Review ..... 1
    1.2 Density Estimation Using Line Transects ..... 4
    1.3 Summary of Results ..... 9

II  DENSITY ESTIMATION USING THE INVERSE SAMPLING PROCEDURE ..... 13
    2.1 Introduction ..... 13
    2.2 A General Model Based on Right Angle Distances and Transect Length ..... 14
        2.2.1 Assumptions ..... 15
        2.2.2 Derivation of the Likelihood Function ..... 16
    2.3 A Parametric Density Estimate ..... 28
        2.3.1 Maximum Likelihood Estimate for D ..... 28
        2.3.2 Unbiased Estimate for D ..... 29
        2.3.3 Variance of D̂_u ..... 31
        2.3.4 Sample Size Determination Using D̂_u ..... 32
    2.4 Nonparametric Density Estimate ..... 34
        2.4.1 The Nonparametric Model for Estimating D ..... 36
        2.4.2 An Estimate for f_Y(0) ..... 37
        2.4.3 Approximations for the Mean and Variance of f_Y(0) ..... 40
        2.4.4 A Monte Carlo Study ..... 42
        2.4.5 The Expected Value and Variance for a Nonparametric Estimate of D ..... 46
        2.4.6 Sample Size Determination Using D̂_N ..... 47

III DENSITY ESTIMATION BASED ON A COMBINATION OF INVERSE AND DIRECT SAMPLING ..... 49
    3.1 Introduction ..... 49
    3.2 Gates' Estimate ..... 50
        3.2.1 The Mean and Variance of D̂_g ..... 54
    3.3 Expected Value of D̂_CP ..... 57
    3.4 Variance of D̂_CP ..... 65
    3.5 Maximum Likelihood Justification for D̂_CP ..... 69

IV  DENSITY ESTIMATION FOR CLUSTERED POPULATIONS ..... 71
    4.1 Introduction ..... 71
    4.2 Assumptions ..... 73
    4.3 General Form of the Likelihood Function ..... 76
    4.4 Estimation of D when p(·) and h(·) Have Specific Forms ..... 79
    4.5 A Worked Example ..... 86

BIBLIOGRAPHY ..... 90
BIOGRAPHICAL SKETCH ..... 92
LIST OF TABLES

TABLE                                                       Page
1  Number of animals, N₀, that must be sighted to guarantee the
   estimate, D̂_u, has coefficient of variation, CV(D̂_u) ..... 34
2  Forms proposed for the function, g(y) ..... 36
3  Results of Monte Carlo Study using g₁(y) = e^(-10y) ..... 45
4  Results of Monte Carlo Study using g₂(y) = e^(-y) ..... 45
5  Results of Monte Carlo Study using g₃(y) = 1 - y ..... 46
6  Number of animals, N₀, that must be sighted to guarantee the
   estimate D̂_N has coefficient of variation, CV(D̂_N) ..... 48
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
POPULATION DENSITY ESTIMATION USING
LINE TRANSECT SAMPLING
By
John A. Ondrasik
December 1979
Chairman: Pejaver V. Rao
Major Department: Statistics
The use of line transect methods in estimating animal
and plant population densities has recently been receiving
increased attention in the literature. Many of the density
estimates which are currently available are based only on
the right angle distances from the sighted objects to a
randomly placed transect of known length. This type of sam-
pling, wherein an observer is required to travel along a line
transect of some predetermined length, will be referred to
as the direct sampling method. In contrast, one can use an
inverse sampling plan which will allow the observer to termi-
nate sampling as soon as he sights a prespecified number of
animals.
An obvious advantage of an inverse sampling plan is that
sampling is terminated as soon as the required number of
objects are sighted. A disadvantage is the possibility that
sampling may not terminate in any reasonable period of time.
Consequently, a third sampling plan, in which sampling stops
as soon as either a prespecified number of objects are sighted
or a prespecified length of the transect is traversed, is of
practical interest. Such a sampling plan will be referred to
as the combined sampling method. The objective of this dis-
sertation is to develop density estimation techniques suit-
able for both inverse and combined sampling plans.
In Chapter II, both a parametric and a nonparametric
estimate for the population density are developed using the
inverse sampling approach. We will show that a primary
advantage of estimation using inverse sampling is the fact
that these estimates can be expressed as the product of two
independent random variables. This representation not only
enables us to obtain the expected value and variance of our
estimates easily, but also leads to a simple criterion for
sample size determination.
In Chapter III, we derive a parametric density estimate
that is suitable for the combined sampling method. This esti-
mate will be shown to be asymptotically unbiased. An approx-
imation to the variance of this estimate is also provided.
The density estimates developed in Chapters II and III
are based on the assumption that the sightings of animals are
independent events. In Chapter IV we relax this assumption
and develop an estimation procedure using inverse sampling
that can be applied to clustered populations--those popula-
tions composed of small groups or "clusters" of objects.
viii
CHAPTER I
INTRODUCTION
1.1 Literature Review
Our objective in this dissertation is to examine the
problem of density estimation in animal and plant populations.
The demand for new and more efficient population density esti-
mates has grown quite rapidly in the past few years. Anderson
et al. (1976, p. 1) give a good assessment of the present sit-
uation and provide some reasons for the renewed interest in
this subject in the following paragraph:
The need to accurately and precisely estimate the
size or density of biological populations has increased
dramatically in recent years. This has been due largely
to ecological problems created by the effects of man's
rapidly increasing population. Within the past decade,
we have witnessed numerous data gathering activities
related to the Environmental Impact Statement (NEPA)
process or Biological Monitoring programs. Environmental
programs related to phosphate, uranium and coal mining
and the extraction of shale oil typically require esti-
mates of the size or density of biological populations.
The Endangered Species Act has focused attention on the
lack of techniques to estimate population size. It now
appears that hundreds of species of plants may be pro-
tected under the Act, and, therefore, we will need infor-
mation on the size of some plant populations. Estimation
of the size of biological populations was a major objec-
tive of the International Biological Program (IBP) (Smith
et al. 1975). Finally, we mention that the ability to
estimate population size or density is fundamental to
efficient wildlife and habitat management and many impor-
tant studies in basic ecological research.
The estimation of population size has always been a very
interesting and complex problem. For a recent review of the
general subject area see Seber (1973). Although many of the
methods described in Seber's book are quite useful, they are
frequently very expensive and time consuming. Estimation
methods based on capture-recapture studies would fall into
this category. A further problem with many estimation meth-
ods is that they are based on models requiring very restric-
tive assumptions which severely limit their use in analyzing
and interpreting the data. For these reasons and others,
line transect sampling schemes are becoming more and more
popular. This method of sampling requires an observer to
travel along a line transect that has been randomly placed
through the area containing the population under study and
to record certain measurements whenever a member of the popu-
lation is sighted. There are several density estimation tech-
niques available using line transect data; however, the full
potential is yet to be realized.
Density estimation through line transects is typically
practical, rapid and inexpensive for a wide variety of popu-
lations. Published references to line transect studies date
back to the method used by King (see Leopold, 1933) in the
estimation of ruffed grouse populations. Since that time,
numerous papers investigating line transect models have
appeared, e.g., Webb (1942), Hayne (1949), Robinette et al.
(1954), Gates et al. (1968), Anderson and Pospahala (1970),
Sen et al. (1974), Burnham and Anderson (1976) and Crain
et al. (1978). Since it is commonly assumed by these authors
that the objects being sampled are fixed with respect to the
transect, line transect models are best suited for either
immotile populations, flushing populations (populations where
the animal being observed makes a conspicuous response upon
the approach of the observer) or slow moving populations.
Examples of such populations are:
(i) immotile birds' nests, dead deer and plants,
(ii) flushing grouse, pheasants and quail, and
(iii) slow moving desert tortoise and gila monster.
The degree to which line transect methods can be applied to
more motile populations, such as deer and hare, will depend
on the degree to which the basic assumptions are met. In any
case, one should proceed cautiously when using these models
for motile populations.
Despite the wide applicability of line transect methods,
the estimation problem has only recently begun to receive
rigorous treatment and attention from a statistical standpoint.
Gates et al. (1968) were the first to develop a density esti-
mation procedure within a standard statistical framework.
After making certain assumptions with regard to the probabil-
ity of sighting an animal located at a given right angle
distance from the transect, they rigorously derived a popu-
lation density estimate. In addition, they were the first
authors to provide an explicit form for the approximate sam-
pling variance of their density estimate.
While the assumptions of Gates et al. (1968) concerning
the probability of sighting an animal did work well for the
ruffed grouse populations they were studying, it is clear
that the validity of their assumptions will be quite crucial
in establishing the validity of their density estimates. If
the collected data fail to substantiate their assumptions,
large biases could occur in the estimates as seen in Robinette
et al. (1974). As a result, Sen et al. (1974) and Pollock
(1978) relaxed the assumptions of Gates et al. (1968) by
using more general forms for the sighting probability, while
Burnham and Anderson (1976) developed a nonparametric approach
as a means of providing a more robust estimation procedure.
In the following sections, we will outline the general
problem of density estimation using line transects, give
our approach to the solution of this problem and summarize
the results found in the remainder of this work.
1.2 Density Estimation Using Line Transects
The line transect method is simply a means of sampling
from some unknown population of objects that are spatially
distributed. In the context of animal or plant population
density estimation, these objects take the form of mammals,
birds, plants, nests, etc., which are distributed over a par-
ticular area of interest. From this point on, our refer-
ences will always be to animal populations with the under-
standing that the estimation methods we describe are appli-
cable to all populations which satisfy the necessary assump-
tions.
In the line transect sampling procedure, a line is ran-
domly placed across an area, A, that contains the unknown
population of interest. An observer follows the transect and
records one or more of the following three pieces of informa-
tion for each animal sighted:
(i) The radial distance, r, from the observer to the
animal.
(ii) The right angle distance, y, from the animal to the
line transect.
(iii) The sighting angle, θ, between the line transect
and the line joining the observer to the point at
which the animal is sighted.
These measurements are illustrated in Figure 1.
Figure 1. Measurements recorded using line transect sampling.
(Z is the position of an observer when an animal
is sighted at X. XP is the line from the animal
perpendicular to the transect.)
In this work, we shall consider the problem of estimating
population density using only the right angle distances.
Because estimates depending only on right angle distances are
easy and economical to use, such estimates have become very
popular over the past several years.
Before any estimation procedure based on right angle
distances can be formulated, certain assumptions regarding
the population of interest must be made. A set of assump-
tions used by several workers in the area is detailed in
Section 2.2.1. One of the key assumptions in this set is
that the probability of sighting an animal located at a right
angle distance, y, from the transect can be represented by
some nonincreasing function g(y), which satisfies the equal-
ity, g(0) = 1. This function is simply a mathematical tool
for dealing with the fact that animals located closer to the
line transect will be seen more readily than animals located
further away from the transect. An alternative method of
dealing with this phenomenon is given by Anderson and
Pospahala (1970).
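The role of g(y) can be sketched numerically. The negative exponential form and the rate parameter below are illustrative assumptions, not the dissertation's data; the sketch only checks that such a curve is nonincreasing with g(0) = 1 and that its integral, c, can be evaluated.

```python
import math

# Hypothetical sighting function: a negative exponential curve.
# It is nonincreasing in y and satisfies g(0) = 1, as required.
def g(y, lam=0.5):   # lam = 0.5 is an illustrative rate
    return math.exp(-lam * y)

# c is the integral of g(y) over (0, infinity); for this g, c = 1/lam.
# Check the closed form against a crude Riemann sum.
dy = 0.001
c_numeric = sum(g(k * dy) * dy for k in range(int(50 / dy)))
print(abs(c_numeric - 1 / 0.5) < 0.01)
```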
If g(y) is assumed to have some specific functional form
determined by some unknown parameters, then the estimate is
said to be parametric. On the other hand, if g(y) is left
unspecified except for the requirements that it is nonincreas-
ing and g(0) = 1, then the estimate is said to be nonparametric.
Seber (1973) has shown that any density estimate based on
right angle distances will have the form
D̂_s = N / (2L₀ĉ),
where N is a random variable representing the number of
animals seen in a line transect of length L₀ and ĉ is an
estimate for c, a parameter which depends on g(y) through
the relation

c = ∫₀^∞ g(y) dy.
By noting that the density is simply the number of animals
present per unit of area, it is clear that c can be inter-
preted as one-half of the effective width of the strip actu-
ally covered by the observer as he moves along the transect.
Further examination of D̂_s also points out that estimating
the parameter c is the key to the estimation problem.
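As a concrete sketch of the estimate described above, with c interpreted as half the effective strip width, the following computes N/(2L₀ĉ) for invented numbers (none of them from the dissertation):

```python
# Direct-sampling density estimate: N / (2 * L0 * c_hat), where c_hat
# estimates the half-width of the effectively surveyed strip.
def density_estimate_direct(n_seen, l0, c_hat):
    return n_seen / (2.0 * l0 * c_hat)

# Illustrative numbers: 40 animals sighted along a 10 km transect with
# c_hat = 0.05 km, so the effective strip is 2*10*0.05 = 1 square km.
d_hat = density_estimate_direct(40, 10.0, 0.05)
print(d_hat)  # 40 animals per square km
```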
At this time, we would like to point out that the range
for the right angle distance, y, is allowed to go from 0 to
+∞, as seen in the integral on the right hand side of the
equation for c. In practice, since we are considering only
a finite area, A, there will most certainly be a maximum
observation distance, W, perpendicular to the transect.
However, if W is large enough so that the approximation

∫₀^W g(y) dy ≈ ∫₀^∞ g(y) dy (1.1)

is reasonable, then letting y range in the interval (0, +∞)
will not cause any real problems. In practical terms, this
means that the probability of observing an animal located
beyond the boundary, W, should be essentially zero.
In most real life situations, W can be chosen large
enough so that the approximation given in (1.1) is valid.
Thus, in the chapters which follow, we will implicitly
assume that relation (1.1) holds for the density estimates
that we develop.
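The quality of approximation (1.1) is easy to check numerically for a specific curve. The exponential g(y) = e^(-y) below is an assumed example, not one of the forms fitted in this work:

```python
import math

# Numerical illustration of approximation (1.1): the integral of g
# over (0, W) versus (0, infinity), for the assumed g(y) = exp(-y).
def integral_of_g(w, dy=1e-4):
    # crude Riemann-sum approximation of the integral over (0, w)
    return sum(math.exp(-k * dy) * dy for k in range(int(w / dy)))

full = 1.0                      # closed form over (0, infinity)
truncated = integral_of_g(8.0)  # W = 8 captures virtually all the mass
print(abs(full - truncated) < 1e-3)
```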
Both parametric and nonparametric models have been used
to derive an estimate for the parameter c, and, consequently,
for the population density. In both cases, the estimate for
c turns out to be a function of the observed right angle
distances. In the parametric case, ĉ will simply be a func-
tion of the parameters that define the function chosen for
g(y). Examples of parametric estimates are found in Gates
et al. (1968), Sen et al. (1974) and Pollock (1978).
Estimation using the nonparametric model is more compli-
cated. Burnham and Anderson (1976) have shown that estimat-
ing 1/c is equivalent to estimating f_Y(0), where f_Y(·) is the
conditional probability density function for right angle
distance given an animal is sighted. Thus, the problem of
finding a nonparametric estimate for the population density
reduces to the problem of estimating a density function at a
given point. Unfortunately, this problem has not received
much attention in the literature. Burnham and Anderson (1976)
suggest four possible estimates for f_Y(0), but the sampling
variances associated with these estimates have not been
established.
Crain et al. (1978) have also considered the problem of
estimating f_Y(0). They derive an estimate using a Fourier
series expansion to approximate the conditional probability
density function f_Y(y). Although their procedure does not
lead to a simple estimate, they do provide an approximation
to its sampling variance.
The line transect method and the corresponding population
density estimates so far described require the observer to
travel a predetermined distance, Lo, along the transect.
This method will be called the direct sampling method. An
alternative to the direct method is the inverse sampling
method, wherein sampling is terminated as soon as a speci-
fied number, No, of animals are sighted. Clearly, in the
direct method, the number of animals seen is a random variable
and the total length travelled is a fixed quantity, while in
the inverse sampling method, the total length travelled is the
random variable and the number of animals that must be seen
is fixed. The main focus of this work will be to develop
density estimation techniques that are based on the inverse
sampling method. In addition, we will consider the density
estimation problem when a combination of the inverse and
direct sampling plans is used.
1.3 Summary of Results
In Chapter 2 we derive two estimates for the population
density, D, using an inverse sampling scheme. The set of
assumptions which justify the use of these estimates is
similar to those used by Gates et al. (1968) and several
others. The estimates have the form

D̂_I = N₀ / (2Lĉ),

where N₀ is the number of animals that must be seen before
sampling terminates, L is a random variable representing the
length travelled on the transect and ĉ is as previously
defined. Note the similarity of D̂_I to D̂_s given in Section 1.2.
The only difference between the two estimates is that in D̂_I
the random variables are L and ĉ, while in D̂_s they are N and ĉ.
However, this difference gives the inverse sampling method a
theoretical advantage over the direct sampling method. The
random variables L and ĉ will be seen to be independent while
N and ĉ are not. Thus, the estimate D̂_I is the product of two
independent random variables, a fact which not only allows
us to obtain its expected value and variance easily, but also
leads to a simple criterion for sample size determination.
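A small simulation can illustrate the behaviour of the inverse-sampling estimate. Here c is treated as known and the parameter values are invented for illustration (in the dissertation c must itself be estimated from the right angle distances), and the sighting rate 2cD comes from the Poisson-process result of Chapter II:

```python
import random

random.seed(1)

# Illustrative check of the inverse-sampling estimate N0 / (2 * L * c),
# with c treated as known. All parameter values are invented.
D_true, c, n0 = 5.0, 0.1, 50
theta = 2.0 * c * D_true   # sighting rate per unit transect length

def one_estimate():
    # distance travelled until n0 sightings: n0 exponential gaps
    L = sum(random.expovariate(theta) for _ in range(n0))
    return n0 / (2.0 * L * c)

estimates = [one_estimate() for _ in range(2000)]
mean_est = sum(estimates) / len(estimates)
# close to D_true = 5, slightly above it since E[1/L] > 1/E[L]
print(round(mean_est, 1))
```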
Both a parametric and a nonparametric estimate for the
animal population density are developed in Chapter II. In
deriving the parametric estimate, the functional form assumed
for g(y) is identical to the one used by Gates et al. (1968).
Our parametric density estimate is shown to be unbiased and
the exact variance of this estimate is also provided.
In the nonparametric case we propose an estimate for
f_Y(0) using the method developed by Loftsgaarden and
Quesenberry (1965). We then use heuristic reasons to show
that the corresponding density estimate is asymptotically
unbiased, and derive a large sample approximation for its
variance.
The inverse sampling method does have one drawback when
there is little information available concerning the popula-
tion to be studied, namely, there exists the possibility that
an observer might have to cover a very long transect to sight
No animals. To overcome this problem, we develop a parametric
density estimate in Chapter III that is based on a combina-
tion of the inverse and the direct sampling procedures. In
the combined sampling scheme, sampling is terminated when
either a prespecified number, No, of animals are sighted or
when a prespecified length, Lo, has been travelled along the
transect. Thus, in combined sampling both the length trav-
elled and the number of animals seen will be random variables.
In deriving the density estimate based on the combined
sampling method, we again use the functional form for g(y)
proposed by Gates et al. (1968). This estimate is shown to
be asymptotically unbiased. In addition, an approximate
variance for this density estimate is provided.
The density estimates developed in Chapters II and III
are based on the assumption that the sightings of animals
will be independent events. Gates et al. (1968) showed that
this assumption failed to hold for the animal population they
were studying. In Chapter IV we relax this assumption, and
develop an estimate based on inverse sampling that can be
applied to clustered populations--populations in which the
animals aggregate into small groups or "clusters." Since
the estimation procedure developed will require the use of
a high-speed computer, the last section of Chapter IV is
devoted to a worked example to illustrate the computations
that would be involved.
CHAPTER II
DENSITY ESTIMATION USING THE INVERSE
SAMPLING PROCEDURE
2.1 Introduction
In this chapter we shall propose estimates for animal
population density based on an inverse sampling procedure.
Unlike the direct sampling method considered by Gates et al.
(1968), the inverse sampling procedure specifies the number
of animals that must be sighted before the sampling can be
terminated. Thus, in the inverse case the number of animals
sighted will be a fixed rather than a random quantity.
A precise formulation of the inverse sampling method is as
follows:
1. Place a line at random across the area, A,
to be sampled.
2. Specify a fixed number, No, and sample along the
line transect until No animals are observed.
As one proceeds along the transect, certain measurements
will be made. These will be denoted by y₁, y₂, ..., y_N₀ and
ℓ, where y_i is the right angle distance from the i-th animal
observed to the transect and ℓ is the total distance trav-
elled along the transect during the observation period.
A visual depiction of these measurements is given in Figure 2.
Figure 2. Measurements recorded using inverse sampling.
2.2 A General Model Based on Right Angle
Distances and Transect Length
The estimates for the density, D, that we will develop
are based on the right angle distances, y₁, y₂, ..., y_N₀, and
the total distance, ℓ, travelled along the transect. Two
possible approaches to the estimation of D merit consider-
ation. First, recall that density is defined as the number
of animals present per unit of area, or equivalently the
rate at which animals are distributed over some specific
area. Therefore, we can write
D = N / A,
where A is the area of interest and N is the total number
of animals present in A. In the direct sampling approach
the estimation of D is most often accomplished by first
estimating N and then dividing by A. Seber (1973) shows
that any estimate of N based on direct sampling has the
form

N̂ = NA / (2L₀ĉ),
where N is a random variable denoting the number of animals
seen, L₀ is the length of the transect and ĉ is an estimate
of c, a parameter which depends on the probability of sight-
ing an animal given its right angle distance from the tran-
sect. Note that in Seber's estimate, N is random and Lo is
fixed. It follows then, that Seber's estimate for D does
not depend explicitly on A and has the form

D̂_s = N / (2L₀ĉ).
Therefore, the estimate of D is independent of the actual
size of A, a property that any reasonable estimate of D
should possess.
As an alternative, D itself can be regarded as the basic
parameter of interest and estimates for D can be derived
directly. This is the approach taken by Burnham and Anderson
(1976) and the one that we will follow in developing our
estimates.
2.2.1 Assumptions
The form of any estimate of D, the animal population
density, will depend upon the type of assumptions we can
make regarding the distribution of the animals to be censused
and the nature of the observations that will be made. The
assumptions our estimates will be based on are as follows:
Al. The animals are randomly distributed with rate or
density D over the area of interest A, i.e., the
probability of a given animal being in a particu-
lar region of area, δA, is δA/A.
A2. The animals are independently distributed over A,
i.e., given two disjoint regions of area, δA₁ and
δA₂,
P(n₁ animals are in δA₁ and n₂ animals are in δA₂)
= P(n₁ animals are in δA₁)·P(n₂ animals are in δA₂).
A3. The probability of sighting an animal depends
only on its distance from the transect. In addi-
tion, there exists a function g(y) giving the
conditional probability of observing an animal
given its right angle distance, y, from the tran-
sect. In probability notation,
g(y) = P(observing an animal | y).
A4. g(0) = 1, i.e., animals on the line are seen with
probability one.
A5. Animals are fixed, i.e., there is no confusion
over animals moving during sampling and none are
counted twice.
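Assumptions A1 to A4 can be mimicked in a toy simulation. Every numeric value below is an illustrative assumption, as is the exponential sighting curve:

```python
import math
import random

random.seed(7)

# Toy sketch of assumptions A1-A4: animals scattered uniformly and
# independently over a strip (A1, A2), each sighted with probability
# g(y) = exp(-lam*y) depending only on its right angle distance y
# from a transect along y = 0 (A3), with g(0) = 1 (A4).
def simulate_sightings(density, length, half_width, lam):
    n_animals = int(density * length * 2 * half_width)
    sightings = []
    for _ in range(n_animals):
        x = random.uniform(0, length)                 # position along transect
        y = abs(random.uniform(-half_width, half_width))
        if random.random() < math.exp(-lam * y):      # sighting probability g(y)
            sightings.append((x, y))
    return sightings

seen = simulate_sightings(density=5.0, length=100.0, half_width=3.0, lam=1.0)
# expected count is about 2 * D * length * (1 - exp(-3)) = 950 here
print(len(seen))
```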
2.2.2 Derivation of the Likelihood Function
We will use the maximum likelihood procedure to obtain
an estimate for D. The joint density function we are inter-
ested in is

f_{Y,L}(y, ℓ; N₀),

where Y = (Y₁, Y₂, ..., Y_N₀) is the vector of random variables
representing the right angle distances, L is the random var-
iable representing the total length travelled, and No is the
specified number of animals to be seen before sampling ter-
minates. Since the dependence of the joint density on No is
implicit throughout the rest of this chapter, it will be
dropped from our notation for convenience. Thus, from now
on we will denote the density as
f_{Y,L}(y, ℓ),
and all other expressions depending on No in this manner will
be handled accordingly.
The following two theorems will be very useful in the
derivation of the likelihood function.
Theorem 1: Let N(ℓ) denote the number of animals sighted in
the interval (0, ℓ] along the transect. Then, N(ℓ) is a
Poisson process, and for some θ > 0,

P{N(ℓ) = n} = e^(-θℓ)(θℓ)^n / n!,  n = 0, 1, 2, ... .

Note that the quantity θℓ equals the expected number of
animals sighted per segment of length ℓ.
Proof: In order to show that N(ℓ) is a Poisson process,
we will show that the assumptions in Section 2.2.1 imply the
postulates necessary for a Poisson process given in Lindgren
(1968, p. 162).
First, consider two disjoint intervals, ℓ₁ and ℓ₂, along
the transect and the corresponding areas, A(ℓ₁) and A(ℓ₂),
enclosed by lines perpendicular to the transect as shown
in Figure 3.
Figure 3. Two disjoint areas along the transect.
Now let N₁ and N₂ be random variables representing the total
number of animals that occupy A(ℓ₁) and A(ℓ₂), respectively.
By definition, N(ℓ₁) and N(ℓ₂) are the number of animals
sighted in A(ℓ₁) and A(ℓ₂), respectively. We know from assump-
tion A2 that N₁ and N₂ are independent, and from assumption A3
that sighting an animal depends only on its distance from the
transect. Thus, N(ℓ₁), which depends solely on N₁ and the
distances to the N₁ animals from the transect, is independent
of N(ℓ₂), i.e., the number of sightings that occur in two
disjoint intervals along the transect are independent events.
Next we will show that for every ℓ > m > 0 and any h > 0,
N(ℓ) - N(m) and N(ℓ+h) - N(m+h) are identically distributed.
First, note that the effective area sampled in seeing N(ℓ) - N(m)
animals and N(ℓ+h) - N(m+h) animals is equal to A(ℓ-m) as seen
in Figure 4.
Figure 4. Effective area sampled in seeing N(ℓ) - N(m)
animals and N(ℓ+h) - N(m+h) animals.
Therefore, by assumptions A1, A2, and A3, and since the tran-
sect is dropped at random, it follows that

P{N(ℓ) - N(m) = j} = P{N(ℓ+h) - N(m+h) = j},  j = 0, 1, 2, ... .

Next we must show that for every ℓ > 0, and some θ > 0,

P{N(ℓ) = 1} = θℓ + o(ℓ), as ℓ → 0,

where o(ℓ) is a function such that

lim_{ℓ→0} o(ℓ)/ℓ = 0.
Again let A(ℓ) be the area defined by ℓ on the transect. Now
define B_j to be the event {N(ℓ) = j} and E_j to be the event
that there are exactly j animals in area A(ℓ). Then it fol-
lows that

P(B₁) = Σ_{j=1}^∞ P(B₁E_j)

      = Σ_{j=1}^∞ P(B₁|E_j)P(E_j).
Under assumptions A1 and A2, Pielou (1969, p. 81) has shown
that

P(E_j) = e^(-DA(ℓ))[DA(ℓ)]^j / j!,  j = 0, 1, 2, ... .

Also, under assumptions A1, A2 and A3, Seber (1973, Eq. (2.6))
has shown that

P(B₁|E₁) = 2cℓ / A(ℓ),

where

c = ∫₀^∞ g(y) dy. (2.1)
Therefore, we can write

P(B₁) = 2cDℓe^(-DA(ℓ)) + Σ_{j=2}^∞ P(B₁|E_j)P(E_j),

and if we show

Σ_{j=2}^∞ P(B₁|E_j)P(E_j) = o(ℓ),
the proof will be complete. Note that

Σ_{j=2}^∞ P(B₁|E_j)P(E_j) ≤ Σ_{j=2}^∞ e^(-DA(ℓ))[DA(ℓ)]^j / j!

                          = DA(ℓ) Σ_{j=2}^∞ e^(-DA(ℓ))[DA(ℓ)]^(j-1) / j!

                          ≤ DA(ℓ) Σ_{j=2}^∞ e^(-DA(ℓ))[DA(ℓ)]^(j-1) / (j-1)!

                          = DA(ℓ)[1 - e^(-DA(ℓ))].
For any finite area A, A(ℓ) is O(ℓ), that is,

lim_{ℓ→0} A(ℓ)/ℓ ≤ K, for some K > 0.

Therefore, as ℓ → 0,

Σ_{j=2}^∞ P(B₁|E_j)P(E_j) = o(ℓ),

and, upon writing

θ = 2cD, (2.2)

we get, as ℓ → 0,

P(B₁) = θℓ + o(ℓ).
Finally, we need to show that for every ℓ > 0,

Σ_{n>1} P{N(ℓ) = n} = o(ℓ), as ℓ → 0.

Note that for all n > 1, we can write

P(B_n) = Σ_{j=1}^∞ P(B_nE_j)

       = Σ_{j=n}^∞ P(B_n|E_j)P(E_j).

Again, by using the fact that A(ℓ) is O(ℓ), it is easy to
show that

P(B_n) = o(ℓ), as ℓ → 0,

and N(ℓ) satisfies the four conditions necessary for a Poisson
process.
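Theorem 1 can be checked by Monte Carlo: generate a Poisson number of animals over a strip, thin each by g(y), and the resulting sighting counts should be Poisson with mean θℓ, so their sample mean and variance should agree. The exponential g and all parameter values below are assumptions for illustration:

```python
import math
import random

random.seed(3)

def poisson_draw(mu):
    # Knuth's method; adequate for the sizes used here
    limit, k, p = math.exp(-mu), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

# Illustrative values: with g(y) = exp(-lam*y), c = 1/lam and theta = 2*c*D.
D, lam, ell, W = 4.0, 2.0, 10.0, 5.0
theta = 2.0 * (1.0 / lam) * D          # theta * ell = 40 sightings expected

def count_sightings():
    n_animals = poisson_draw(D * ell * 2 * W)   # A1-A2: Poisson-many animals
    seen = 0
    for _ in range(n_animals):
        y = abs(random.uniform(-W, W))          # right angle distance
        if random.random() < math.exp(-lam * y):  # A3-A4: thin by g(y)
            seen += 1
    return seen

counts = [count_sightings() for _ in range(2000)]
mean = sum(counts) / len(counts)
var = sum((k - mean) ** 2 for k in counts) / len(counts)
print(round(mean, 1), round(var, 1))   # both should sit near theta*ell = 40
```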
Before proceeding to the second theorem, we need to define
the following random variables. Let T_i denote the random
variable corresponding to the distance travelled on the
transect between sightings of the (i-1)st and i-th animals,
i = 1, 2, ..., N₀. Then the total distance travelled is given by

L = Σ_{i=1}^{N₀} T_i.
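Since the sightings form a Poisson process (Theorem 1), the gaps T_i are independent exponential variables with rate θ, so L is a sum of N₀ such gaps with mean N₀/θ. A quick simulated check, using invented values for θ and N₀:

```python
import random

random.seed(11)

# With theta = 2 and N0 = 25 (both invented for illustration), L is a
# sum of 25 independent Exp(theta) gaps, so its mean should be
# N0 / theta = 12.5.
theta, n0 = 2.0, 25
samples = [sum(random.expovariate(theta) for _ in range(n0))
           for _ in range(4000)]
mean_L = sum(samples) / len(samples)
print(round(mean_L, 1))  # near 12.5
```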
The following theorem establishes the independence of
Y₁, Y₂ and T₁, T₂ for the case N₀ = 2, and this fact enables
us to derive the joint density function, f_{Y,L}(y, ℓ).
Theorem 2. The random variables T₁, T₂, Y₁ and Y₂ are mutually
independent.
Proof: In order to establish the independence of T₁, T₂,
Y₁ and Y₂ we will derive the joint density
f_{T₁,T₂,Y₁,Y₂}(t₁,t₂,y₁,y₂)
and show that it can be factored into four functions, each
depending on only one of the random variables of interest.
Let y₁, y₂, t₁, t₂, h₁, h₂, g₁ and g₂ be non-negative
real numbers such that
t₁ + h₁ < t₁ + t₂,
as shown in Figure 5.
[Figure 5. Areas defined by y₁, y₂, t₁, t₂, g₁, g₂, h₁, and h₂.]
Now let
P(h₁,g₁,h₂,g₂) = P(t₁ < T₁ ≤ t₁+h₁, y₁ < Y₁ ≤ y₁+g₁,
                   t₂ < T₂ ≤ t₂+h₂, y₂ < Y₂ ≤ y₂+g₂).
Then we can write
f_{T₁,T₂,Y₁,Y₂}(t₁,t₂,y₁,y₂) = lim_{hᵢ→0, gᵢ→0, i=1,2} P(h₁,g₁,h₂,g₂)/(h₁g₁h₂g₂),
provided the limit exists.
Now notice that the event whose probability we wish to
find, namely
{t₁ < T₁ ≤ t₁+h₁, y₁ < Y₁ ≤ y₁+g₁, t₂ < T₂ ≤ t₂+h₂, y₂ < Y₂ ≤ y₂+g₂},
is equivalent to the intersection of the following events:
S₁, the event {N(t₁) = 0};
S₂, the event {N(t₁+h₁) − N(t₁) = 1} and {y₁ < Y₁ ≤ y₁+g₁},
i.e., an animal is seen in area I;
S₃, the event {N(t₁+t₂) − N(t₁+h₁) = 0};
S₄, the event {N(t₁+t₂+h₂) − N(t₁+t₂) = 1} and
{y₂ < Y₂ ≤ y₂+g₂}, i.e., an animal is seen in area II.
Now, by Theorem 1 and Assumption A3, the events S₁, S₂, S₃
and S₄ are independent so that we can write
P(h₁,g₁,h₂,g₂) = P(S₁S₂S₃S₄)
              = P(S₁)P(S₂)P(S₃)P(S₄).
We now need to find expressions for the probabilities of
S₁, S₂, S₃ and S₄. Since N(ℓ) is a Poisson process,
P(S₁) = e^{-θt₁},
and
P(S₃) = e^{-θ(t₂-h₁)}.
However, P(S₂) and P(S₄) are not so easily obtained. We will
only show how to find P(S₂), since P(S₄) is found in a similar
fashion. First, define S₂ⱼ to be the event that there are
exactly j animals in area I. Then
P(S₂) = Σ_{j=1}^∞ P(S₂S₂ⱼ)
      = Σ_{j=1}^∞ P(S₂|S₂ⱼ)P(S₂ⱼ).
By assumptions A1 and A2, the number of animals located in
area I will be distributed as a Poisson random variable with
parameter 2Dg₁h₁, where D is the density of the animals (see
Pielou, 1969, p. 81). Note, the factor of 2 comes in since
area I can be found on both sides of the transect. Therefore,
P(S₂ⱼ) = e^{-2Dg₁h₁}(2Dg₁h₁)^j / j!,  j = 0,1,2,... .
By assumption A3,
P(S₂|S₂₁) = g(y₁′),
for some y₁ < y₁′ ≤ y₁+g₁. Then, by an argument similar to
that found in Theorem 1, as g₁→0 and h₁→0,
P(S₂) = 2Dg₁h₁e^{-2Dg₁h₁}g(y₁′) + Σ_{j=2}^∞ P(S₂|S₂ⱼ)P(S₂ⱼ)
      = 2Dg₁h₁e^{-2Dg₁h₁}g(y₁′) + o(g₁h₁).
Similarly, we can show that as g₂→0 and h₂→0,
P(S₄) = 2Dg₂h₂e^{-2Dg₂h₂}g(y₂′) + o(g₂h₂),
for some y₂ < y₂′ ≤ y₂+g₂.
We can now substitute the expressions obtained
for P(S₁), P(S₂), P(S₃) and P(S₄) into P(h₁,g₁,h₂,g₂)
to obtain
P(h₁,g₁,h₂,g₂) = e^{-θt₁} {2Dg₁h₁e^{-2Dg₁h₁}g(y₁′) + o(g₁h₁)}
                × e^{-θ(t₂-h₁)} {2Dg₂h₂e^{-2Dg₂h₂}g(y₂′) + o(g₂h₂)}.
Consequently,
lim_{hᵢ→0, gᵢ→0, i=1,2} P(h₁,g₁,h₂,g₂)/(g₁h₁g₂h₂) = 4D² e^{-θt₁} e^{-θt₂} g(y₁)g(y₂),   (2.3)
which completes the proof of Theorem 2.
In the same manner, we can show that the independence
established in Theorem 2 will hold for any finite number of
sightings, N₀. In this case, if T = (T₁,T₂,...,T_{N₀}) and
Y = (Y₁,Y₂,...,Y_{N₀}), then (2.3) becomes
f_{T,Y}(t,y) = 2^{N₀} D^{N₀} e^{-θ Σ_{i=1}^{N₀} tᵢ} ∏_{i=1}^{N₀} g(yᵢ).
Upon using equation (2.2) in f_{T,Y}(t,y), we get
f_{T,Y}(t,y) = θ^{N₀} e^{-θ Σ_{i=1}^{N₀} tᵢ} ∏_{i=1}^{N₀} [g(yᵢ)/c].
Thus, the marginal distributions for Tᵢ and Yᵢ are
f_Y(yᵢ) = g(yᵢ)/c,  yᵢ > 0,
and
f_T(tᵢ) = θe^{-θtᵢ},  tᵢ > 0.
Therefore, T₁,T₂,...,T_{N₀} are independent, identically distri-
buted (iid) Exponential random variables with parameter θ,
and
L = Σ_{i=1}^{N₀} Tᵢ
has a Gamma distribution with parameters N₀ and θ, i.e.,
f_L(ℓ) = θ^{N₀} ℓ^{N₀-1} e^{-θℓ} / Γ(N₀),  ℓ > 0, θ > 0.
Furthermore, L is independent of Y.
The likelihood function for the estimation of θ and c
can now be obtained by taking the product of f_L(ℓ) and
f_Y(y), i.e.,
L(θ,c;y,ℓ) = [∏_{i=1}^{N₀} g(yᵢ)/c^{N₀}] θ^{N₀} ℓ^{N₀-1} e^{-θℓ} / Γ(N₀).   (2.4)
We will now outline how one can estimate D, the animal
population density, from the likelihood function given in
(2.4). As noted earlier, D is related to θ and c by equation
(2.2), i.e.,
D = θ/2c.
Thus, the maximum likelihood estimate for D would be
D̂ = θ̂/2ĉ,
where θ̂ and ĉ are maximum likelihood estimates of θ and c,
respectively, obtained from (2.4). Note that the estimate
D̂ is the ratio of two mutually independent random variables,
one depending on L alone and the other depending on Y alone.
This property will be found to be very useful when evaluating
the moments of D̂.
We have now set the framework necessary for deriving
an estimate of D. In the next section we shall obtain an
estimate for D assuming that g(y) has a particular parametric
form.
2.3 A Parametric Density Estimate
Any estimate for D that is derived after assuming an
explicit function for g(y) will be called a parametric esti-
mate. Gates et al. (1968), using direct sampling, derived an
estimate for D assuming
g(y) = e^{-λy},  y ≥ 0, λ > 0.
Using this same function for g(y), we will derive the corre-
sponding estimate based on inverse sampling.
2.3.1 Maximum Likelihood Estimate for D
To estimate D we need to estimate both θ and c from the
likelihood function (2.4). In this case
g(y) = e^{-λy},  y ≥ 0, λ > 0,
so that
c = 1/λ.
Substituting for c in (2.2) yields
D = θλ/2.                                               (2.5)
Also, by substituting for c in (2.4), the likelihood function
becomes
L(θ,λ;y,ℓ) = λ^{N₀} e^{-λ Σ_{i=1}^{N₀} yᵢ} θ^{N₀} ℓ^{N₀-1} e^{-θℓ} / Γ(N₀),  θ > 0, λ > 0, ℓ > 0, yᵢ > 0.   (2.6)
The joint maximum likelihood estimates for θ and λ can
now be easily obtained. The natural logarithm of the likeli-
hood function is
ln L(θ,λ;y,ℓ) = N₀ ln λ − λ Σ_{i=1}^{N₀} yᵢ + N₀ ln θ + (N₀-1) ln ℓ − θℓ − ln Γ(N₀).
Taking the partial derivatives with respect to θ and λ yields
∂ ln L(θ,λ;y,ℓ)/∂θ = N₀/θ − ℓ,
∂ ln L(θ,λ;y,ℓ)/∂λ = N₀/λ − Σ_{i=1}^{N₀} yᵢ.
Setting these equal to 0 yields
θ̂ = N₀/ℓ
and
λ̂ = N₀ / Σ_{i=1}^{N₀} yᵢ.
Substituting these estimates for θ and λ in (2.5), the
maximum likelihood estimate for D is seen to be
D̂ = θ̂λ̂/2 = N₀² / (2ℓ Σ_{i=1}^{N₀} yᵢ).
2.3.2 Unbiased Estimate for D
The expected value of the estimate D̂ developed in
Section 2.3.1 is
E(D̂) = ½ E(θ̂)E(λ̂),
since θ̂ and λ̂ are independent. Using the fact that L has a
Gamma distribution with parameters N₀ and θ, we obtain
E(θ̂) = E(N₀/L) = N₀θ/(N₀-1).
To derive an expression for E(λ̂), first recall that
Y₁,Y₂,...,Y_{N₀} are iid with the common density
f_Y(y) = g(y)/c = λe^{-λy},  y > 0.
Therefore, Σ_{i=1}^{N₀} Yᵢ is distributed as a Gamma random variable
with parameters N₀ and λ and
E(λ̂) = E(N₀ / Σ_{i=1}^{N₀} Yᵢ) = N₀λ/(N₀-1).
Independence of θ̂ and λ̂ now yields
E(D̂) = ½ E(θ̂)E(λ̂)
      = N₀²θλ / 2(N₀-1)²
      = N₀²D/(N₀-1)².
An unbiased estimate for D is, therefore, given by
D̂_u = (N₀-1)² / (2L Σ_{i=1}^{N₀} Yᵢ).                     (2.7)
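To make the estimates concrete, the sketch below computes both D̂ and D̂_u from a sample; the function name and the toy numbers are ours, not the author's.

```python
def density_estimates(L, ys):
    """MLE and bias-corrected density estimates from an inverse
    line-transect sample: L is the total distance travelled to the
    N0-th sighting, ys are the right angle distances.
    Returns (D-hat, D-hat_u); cf. Section 2.3.1 and (2.7)."""
    n0 = len(ys)
    s = sum(ys)
    return n0 * n0 / (2 * L * s), (n0 - 1) ** 2 / (2 * L * s)

# toy sample: N0 = 4 sightings over L = 10, right angle distances all 0.5
d_mle, d_unb = density_estimates(10.0, [0.5, 0.5, 0.5, 0.5])
```

The unbiased version simply rescales the MLE by the factor (N₀-1)²/N₀².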
2.3.3 Variance of D̂_u
Due to the independence of L and Σ_{i=1}^{N₀} Yᵢ, the variance of
D̂_u can be derived directly. We have
Var(D̂_u) = Var[(N₀-1)² / (2L Σ_{i=1}^{N₀} Yᵢ)]
 = [(N₀-1)⁴/4] Var[1/(L Σ Yᵢ)]
 = [(N₀-1)⁴/4] {E[1/(L Σ Yᵢ)²] − (E[1/(L Σ Yᵢ)])²}.
Since L and Σ_{i=1}^{N₀} Yᵢ are independent it follows that
Var(D̂_u) = [(N₀-1)⁴/4] {E(1/L²)E[1/(Σ Yᵢ)²] − [E(1/L)]²[E(1/Σ Yᵢ)]²}.   (2.8)
Deriving the Var(D̂_u) now reduces to the problem of evaluating
the expected values for 1/L, 1/Σ Yᵢ, 1/L², and 1/(Σ Yᵢ)². Expres-
sions for these quantities are easily obtained by noting that
L and Σ_{i=1}^{N₀} Yᵢ have Gamma distributions with respective parameters
N₀,θ and N₀,λ. Straightforward calculations show that for
N₀ > 2,
E(1/L) = θ/(N₀-1),                                      (2.9)
E(1/Σ Yᵢ) = λ/(N₀-1),                                   (2.10)
E(1/L²) = θ²/(N₀-1)(N₀-2),                              (2.11)
E[1/(Σ Yᵢ)²] = λ²/(N₀-1)(N₀-2).                         (2.12)
Now using (2.5), (2.9), (2.10), (2.11), and (2.12) in (2.8)
we get
Var(D̂_u) = [(N₀-1)⁴/4] {θ²λ²/(N₀-1)²(N₀-2)² − θ²λ²/(N₀-1)⁴}
 = D² [(N₀-1)²/(N₀-2)² − 1]
 = D² (2N₀-3)/(N₀-2)²,                                  (2.13)
provided N₀ > 2. Note that Var(D̂_u) does not exist if N₀ ≤ 2.
2.3.4 Sample Size Determination Using D̂_u
The first problem in designing a survey using the inverse
line transect method is to determine in advance the number
of animals, N₀, that must be sighted before sampling terminates.
One criterion for the selection of N₀ (see Seber, 1973) is the
requirement that the design must yield an estimate of the
density, D, with a prescribed coefficient of variation,
CV = σ_D̂ / E(D̂),
where σ_D̂ and E(D̂) denote, respectively, the standard deviation
and the expected value for the estimate, D̂. As one can
see immediately, small values of CV are desirable since this
indicates that the estimate has a small standard deviation
relative to its expected value.
With the inverse sampling method, the value of N₀ needed
to guarantee a preset value, C, for the coefficient of varia-
tion of D̂_u can be calculated easily. Using (2.7) and (2.13)
we see that, for N₀ > 2,
CV(D̂_u) = (2N₀-3)^{1/2} / (N₀-2).
Then, setting C = CV(D̂_u), it is easily shown that N₀ is a
root of the quadratic equation
C²N₀² − (4C²+2)N₀ + 4C²+3 = 0.
Solving for N₀ yields the two roots
N₀ = 2 + [1 ± (1+C²)^{1/2}]/C².
Since the variance of D̂_u exists only for N₀ > 2, the required
sample size is
N₀ = 2 + [1 + (1+C²)^{1/2}]/C².
For example, if C = .25, then N₀ = 35. Table 1 gives values of
N₀ corresponding to coefficients of variation ranging from
.1 to .5.
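The required N₀ is easy to tabulate; the short script below (the function name is ours) rounds the root up to the next integer and reproduces the entries of Table 1.

```python
import math

def inverse_sample_size(C):
    """Smallest N0 (> 2) with CV(D-hat_u) = sqrt(2*N0 - 3)/(N0 - 2) <= C,
    i.e. the ceiling of 2 + (1 + sqrt(1 + C^2))/C^2."""
    return math.ceil(2 + (1 + math.sqrt(1 + C * C)) / (C * C))

for C in (0.50, 0.40, 0.30, 0.25, 0.20, 0.15, 0.10):
    print(f"C = {C:.2f}  N0 = {inverse_sample_size(C)}")
```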
Table 1. Number of animals, N₀, that must be sighted to
guarantee the estimate, D̂_u, has coefficient of
variation, CV(D̂_u).

CV(D̂_u)    N₀
 .50        11
 .40        15
 .30        25
 .25        35
 .20        53
 .15        92
 .10       203
2.4 Nonparametric Density Estimate
In this section we will consider a nonparametric estimate
for the population density, D, using inverse sampling. In
contrast to the parametric approach used in Section 2.3,
the nonparametric approach leaves the function g(y), which
represents the probability of observing an animal given its
right angle distance, unspecified.
In Section 2.2.2 we showed that an estimate for D is
given by
D̂ = θ̂/2ĉ,
where θ̂ and ĉ are the estimates for θ, the expected number of
sightings per unit length of the transect, and c, defined as
c = ∫₀^∞ g(y)dy.
If g(y) is completely specified, except for some parameters,
then the problem of estimating D reduces to the problem of
estimating θ and the parameters in g(y). In Section 2.3 we
considered the specific case
g(y) = e^{-λy}.
A drawback to this approach, where we specify a functional
form for g(y), is that the function chosen must take into
account the inherent detection difficulties that are present
when a particular animal species is being sampled. If one
examines the various forms that have been suggested for g(y),
one quickly becomes aware of the problem of finding a form
that is flexible enough to accommodate the many possibilities
which exist. Some of the functions that have been proposed
for g(y) are presented in Table 2. As seen in the table, the
suggestions for g(y) represent a number of different shapes
in an effort to reflect the nature of the animal being sampled
and the type of ground cover being searched.
Because of the problems that can arise in choosing a
function for g(y), Burnham and Anderson (1976) considered
a nonparametric approach as a means of avoiding the need for
the specification of g(y). Leaving g(y) unspecified will
allow the estimation procedure to depend on the observations
that are actually made, not on any particular model. Thus,
a nonparametric model might provide a more robust estimation
method, that is, an estimation method that could be applied
to a much wider class of animal species.
Table 2. Forms proposed for the function, g(y).

Function                                                      Author
g(y) = e^{-λy},  λ > 0                                        Gates et al. (1968)
g(y) = 1 − (y/w)^a,  0 ≤ y ≤ w;  0, y > w                     Eberhardt (1968)
g(y) = 1,  0 ≤ y ≤ w;  0, y > w                               Seber (1973)
g(y) = ∫_y^∞ β^α x^{α-1} e^{-βx} / Γ(α) dx,  β > 0, α > 0     Sen et al. (1974)
g(y) = e^{-λy^p},  p > 0, λ > 0                               Pollock (1978)
2.4.1 The Nonparametric Model for Estimating D
Consider the estimate for D developed in Section 2.2.2,
that is,
D̂ = θ̂/2ĉ.
As noted earlier, if
g(y) = e^{-λy},
then
c = 1/λ
and our estimate for D is
D̂ = θ̂λ̂/2.
Now, if g(y) is left unspecified, then an estimate for 1/c may
be obtained along the same lines Burnham and Anderson (1976)
used in the case of direct sampling. By assumption A4,
f_Y(0) = g(0)/c = 1/c.
Hence, 1/c equals the value of f_Y(·) evaluated at y = 0,
where f_Y(·) is the probability density function for the right
angle distance, Y, given an animal is seen. The problem of
finding a nonparametric estimate for 1/c, therefore, reduces
to the problem of finding an estimate, f̂_Y(0), for f_Y(0).
An estimate for D will then be given by
D̂_N = θ̂ f̂_Y(0) / 2,                                      (2.14)
where θ̂ may be taken as the maximum likelihood estimate derived
in Section 2.3.1. That is,
θ̂ = (N₀-1)/L,
where we have replaced N₀ by N₀-1 to remove the bias.
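The bias removal in θ̂ = (N₀-1)/L can be checked by simulation, since under the model of Section 2.2 L is a sum of N₀ iid Exponential(θ) gaps. The sketch below (function name and parameter values are ours) recovers θ to within Monte Carlo error.

```python
import random

def theta_hat_mean(theta=2.0, n0=35, reps=20000, seed=1):
    """Average of theta-hat = (n0 - 1)/L over simulated transects,
    where L = T_1 + ... + T_n0 with T_i ~ Exponential(theta)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        L = sum(rng.expovariate(theta) for _ in range(n0))
        total += (n0 - 1) / L
    return total / reps
```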
2.4.2 An Estimate for f_Y(0)
Burnham and Anderson (1976) suggested four possible
methods for estimating f_Y(0), but we are not aware of any
work which investigates the theoretical properties of any
of these estimates. Loftsgaarden and Quesenberry (1965) con-
sidered a density function estimate based on the observation
that
f_Y(x) = lim_{h→0} [F_Y(x+h) − F_Y(x-h)] / 2h,
where F_Y(·) is the cumulative distribution function. For the
purpose of estimating f_Y(0), their estimate takes the form
f̂_Y(0) = ([√n + 1] − 1) / (n Y_([√n+1])),                (2.15)
where [√n + 1] is the value of √n + 1 rounded off to the
nearest integer and Y_(j) is the jth order statistic of the
sample y₁,y₂,...,yₙ.
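As reconstructed above, estimate (2.15) takes only a few lines to compute; the function name is ours, and Python's round stands in for rounding to the nearest integer.

```python
import math

def f0_hat(ys):
    """Loftsgaarden-Quesenberry-type estimate of f_Y(0), eq. (2.15):
    r = [sqrt(n) + 1] rounded to the nearest integer, and
    f0-hat = (r - 1)/(n * y_(r)) with y_(r) the r-th order statistic."""
    n = len(ys)
    r = round(math.sqrt(n) + 1)
    y_r = sorted(ys)[r - 1]
    return (r - 1) / (n * y_r)
```

For example, with n = 9 equally spaced distances 0.05, 0.10, ..., 0.45, r = 4 and the estimate is 3/(9 × 0.20).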
Loftsgaarden and Quesenberry (1965) showed that f̂_Y(0)
as given in (2.15) is a consistent estimate, provided f_Y(·)
is a positive and continuous probability density function.
One nice property of f̂_Y(0) is that it can be easily calcu-
lated from the data. However, evaluation of the moments of
this estimate does present some problems. In fact, the mean
and the variance may not even exist in some cases. But,
whenever [√n + 1] ≥ 3, i.e., whenever n ≥ 3, the variance of
f̂_Y(0) is finite as shown in the following theorem.
Theorem 3. Let Y₁,Y₂,...,Yₙ be a set of independent, iden-
tically distributed random variables, representing the right
angle distances, with continuous probability density function
(p.d.f.)
f_Y(y) = g(y)/c,  y > 0.
Also, let Y_(r) be the rth order statistic. Then
E[1/Y_(r)²] < +∞
for every integer r, such that 3 ≤ r ≤ n.
Proof: The density function for Y_(r) is
h_r(y) = {n!/[(r-1)!(n-r)!]} F_Y^{r-1}(y) [1−F_Y(y)]^{n-r} f_Y(y),
where
F_Y(y) = ∫₀^y f_Y(t)dt.
Therefore,
E[1/Y_(r)²] = {n!/[(r-1)!(n-r)!]} ∫₀^∞ (1/y²) F_Y^{r-1}(y) [1−F_Y(y)]^{n-r} dF_Y(y).
Since g(y) represents a probability, g(y) ≤ 1 and
F_Y(y) = ∫₀^y [g(t)/c] dt ≤ y/c.
Therefore,
E[1/Y_(r)²] ≤ {n!/[(r-1)!(n-r)! c²]} ∫₀^∞ F_Y^{r-3}(y) [1−F_Y(y)]^{n-r} dF_Y(y)
 = {n!/[(r-1)!(n-r)!]} Γ(r-2)Γ(n-r+1) / [c² Γ(n-1)],
which completes the proof.
Simple asymptotic approximations for the mean and variance
of f̂_Y(0), which work well for several densities given in
Table 2, can be developed using the first order terms in a
formal Taylor series expansion of f̂_Y(0). The basic ideas
involved in the derivation of these approximations are pre-
sented in the following section.
2.4.3 Approximations for the Mean and Variance of f̂_Y(0)
Let F(·) and F⁻¹(·) denote the cumulative distribution
function and its inverse for the random variable Y, the right
angle distance. Also, let
r = [√N + 1],
U_r = F(Y_(r)),
and
ψ(U_r) = (r-1) / [N F⁻¹(U_r)],
where Y_(r) is the rth order statistic in a random sample of
size N from F(·). Then proceeding as in Lindgren (1968,
p. 409) it is easy to see that
f̂_Y(0) = ψ(U_r),
E(U_r) = p_r = r/(N+1),
and
Var(U_r) = p_r(1-p_r)/(N+2).                            (2.16)
Assuming that ψ(·) is continuous and differentiable once at
p_r, the first order terms in the Taylor series expansion of
ψ(·) at p_r yield the approximation
ψ(U_r) ≈ ψ(p_r) + (U_r − p_r) dψ(u)/du |_{u=p_r}.       (2.17)
Taking expectations on both sides of (2.17) yields
E{ψ(U_r)} ≈ ψ(p_r),
and substituting for r, p_r and ψ(·) yields
E{f̂_Y(0)} ≈ ([√N+1] − 1) / {N F⁻¹([√N+1]/(N+1))}.
Taking the limit as N tends to infinity and noting that
F⁻¹(0) is 0 and u = F(y) yields
lim_{N→∞} E{f̂_Y(0)} = lim_{u→0} u/F⁻¹(u)
 = dF(y)/dy |_{y=0} = f_Y(0).                           (2.18)
Thus for large N, f̂_Y(0) is approximately unbiased.
An approximation for the variance of f̂_Y(0) is found
in a similar fashion. Using (2.17) we get
Var{ψ(U_r)} ≈ [dψ(u)/du |_{u=p_r}]² Var(U_r).
Evaluating the derivative yields
dψ(u)/du |_{u=p_r} = −(r-1) / {N [F⁻¹(p_r)]² f_Y(F⁻¹(p_r))},
so that
Var{ψ(U_r)} ≈ [(r-1) / {N [F⁻¹(p_r)]² f_Y(F⁻¹(p_r))}]² Var(U_r).
Now, using (2.16) and then making the appropriate substitu-
tions for r, p_r and ψ(·), we get
Var{f̂_Y(0)} ≈ (r-1)² p_r(1-p_r) / {N²(N+2) [F⁻¹(p_r)]⁴ f_Y²(F⁻¹(p_r))},
with r = [√N+1] and p_r = r/(N+1). Therefore, as N → ∞ we have
lim_{N→∞} √N Var{f̂_Y(0)} = f_Y²(0),
so that an approximation for the variance, when N is large,
is given by
Var{f̂_Y(0)} ≈ f_Y²(0)/√N.                               (2.19)
As stated earlier, the expressions obtained for the
expected value and variance of f̂_Y(0) are only approximations.
Their adequacy for practical purposes may be evaluated by a
Monte Carlo study involving various specific forms for the
p.d.f., f_Y(·). In the next section we will look at the results
of just this kind of simulation study.
2.4.4 A Monte Carlo Study
A Monte Carlo study was used to examine the approximations
for E{f̂_Y(0)} and Var{f̂_Y(0)} presented in Section 2.4.3. Three
possible shapes for f_Y(·) were used in the study. Since the
shape of f_Y(·) depends solely on the choice of g(y), the
functions
g₁(y) = e^{-10y},  y > 0,
g₂(y) = 1 − y,  0 < y < 1,
and
g₃(y) = 1 − y²,  0 < y < 1,
were chosen. The function g₁(·) was first proposed by Gates
et al. (1968), while Eberhardt (1968) suggested both g₂(·)
and g₃(·). The different shapes that these three functions
represent are depicted in Figure 6.
[Figure 6. Three forms for the function g(y).]
For each value of n = 25, 35, 45, 65, 80 and 100, two
thousand random samples of size n were selected from each of
the three populations defined by gᵢ(·), i = 1,2,3. These
samples were obtained by first generating observations from
a uniform distribution defined on the interval [0,1] and then
transforming these values using the appropriate density f_Y(·).
The UNIFORM function described in Barr et al. (1976) was used
to generate the samples from the uniform distribution. For
each set of 2000 samples, empirical estimates were calculated
for the expected value, ē, the percent bias, B_e, and the
standard deviation, σ_e, of f̂_Y(0) given in equation (2.15)
as follows:
Let f̂_iY(0) denote the estimate from the ith sample,
i = 1,2,...,2000. Then
ē = (1/2000) Σ_{i=1}^{2000} f̂_iY(0),
B_e = 100 [ē − f_Y(0)] / f_Y(0),
and
σ_e = [(1/1999) Σ_{i=1}^{2000} (f̂_iY(0) − ē)²]^{1/2}.
All of the necessary computing was performed under release
76.6B of SAS (see Barr et al., 1976) at the Northeast
Regional Data Center located at the University of Florida.
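The inverse-CDF transformation used in the study can be sketched for g₂ (our code, not the original SAS program): for g₂(y) = 1 − y the sighting density is f_Y(y) = 2(1 − y) on (0,1), so F(y) = 2y − y² and the transform is y = 1 − √(1 − u).

```python
import math
import random

def draw_y_g2(rng):
    """One inverse-CDF draw from f_Y(y) = 2(1 - y), 0 < y < 1."""
    return 1.0 - math.sqrt(1.0 - rng.random())

rng = random.Random(42)
sample = [draw_y_g2(rng) for _ in range(20000)]
mean_y = sum(sample) / len(sample)   # E(Y) = 1/3 for this density
```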
The results of the study, along with the approximate
standard deviations,
σ_T = f_Y(0)/n^{1/4},
are presented in Tables 3, 4, and 5. As can be seen from the
tables, the estimate of f_Y(0) has a negative bias for most
samples, generally of a magnitude less than 10% of the true value.
The ratio σ_T/σ_e is also within 10% of one for almost
all samples considered. This is even true for the smaller
sample sizes, n ≤ 45. Also, when considering the smaller sample
sizes, the ratio was for the most part greater than one.
Based on the results of this simulation, we feel that, in
practice, the approximations obtained for the expected value
and variance of f̂_Y(0) would perform adequately.
Table 3. Results of Monte Carlo Study using g₁(y) = e^{-10y}.

Sample Size    ē      B_e     σ_T    σ_e    σ_T/σ_e
  25          9.05   -9.5    4.47   4.55     .98
  35          9.08   -9.2    4.11   4.12    1.00
  45          8.87  -11.3    3.86   3.65    1.06
  65          9.48   -5.2    3.52   3.65     .96
  80          9.49   -5.1    3.35   3.56     .94
 100          9.48   -5.2    3.16   3.13    1.01

For g₁(y) the theoretical mean is 10.
Table 4. Results of Monte Carlo Study using g₂(y) = 1 − y.

Sample Size    ē      B_e     σ_T    σ_e    σ_T/σ_e
  25          1.88   -6.0    .894   .850    1.05
  35          1.88   -6.0    .822   .801    1.03
  45          1.83   -8.5    .772   .674    1.15
  65          1.96   -2.0    .704   .722     .98
  80          1.92   -4.0    .669   .647    1.03
 100          1.93   -3.5    .632   .615    1.03

For g₂(y) the theoretical mean is 2.
Table 5. Results of Monte Carlo Study using g₃(y) = 1 − y².

Sample Size    ē      B_e     σ_T    σ_e    σ_T/σ_e
  25          1.47   -2.0    .671   .625    1.07
  35          1.48   -1.3    .616   .594    1.04
  45          1.44   -4.0    .579   .559    1.04
  65          1.51     .7    .528   .536     .99
  80          1.49     .7    .502   .506     .99
 100          1.50    0.0    .474   .477     .99

For g₃(y) the theoretical mean is 1.5.
2.4.5 The Expected Value and Variance for
a Nonparametric Estimate of D
Now that we have decided upon an estimate for f_Y(0), the
problem of estimating D is straightforward. Substituting
the estimate, f̂_Y(0), defined in Section 2.4.2 into expression
(2.14), a nonparametric estimate for D is
D̂_N = (N₀-1)([√N₀+1] − 1) / {2L N₀ Y_([√N₀+1])}.          (2.20)
Expressions for the expected value and variance of D̂_N
are easily obtained. Since L and Y₁,Y₂,...,Y_{N₀} are indepen-
dent, we can write
E(D̂_N) = ½ E(θ̂) E{f̂_Y(0)},
and (see Goodman, 1960, Eq. (2))
Var(D̂_N) = ¼ [E²(θ̂) Var{f̂_Y(0)} + E²{f̂_Y(0)} Var(θ̂) + Var(θ̂) Var{f̂_Y(0)}],
where
θ̂ = (N₀-1)/L
and
f̂_Y(0) = ([√N₀+1] − 1) / {N₀ Y_([√N₀+1])}.
Then, upon substituting the appropriate expressions for the
moments of θ̂ and f̂_Y(0) into the above equations, we get
E(D̂_N) ≈ D,                                             (2.21)
and
Var(D̂_N) ≈ D² (√N₀ + 1)/(N₀ + 2).                        (2.22)
2.4.6 Sample Size Determination Using D̂_N
We can now determine the approximate value of N₀ that
is needed to guarantee some preset value for the coefficient
of variation of D̂_N, CV(D̂_N). These values for N₀ can then
be compared to the corresponding values for N₀ (see Table 1)
that are needed to ensure the same coefficient of variation
with the parametric estimate, D̂_u. Using (2.21) and (2.22),
we see that an approximation for the coefficient of variation
of D̂_N is
CV(D̂_N) ≈ (√N₀ + 1)^{1/2} / (N₀ + 2)^{1/2},
and by setting C = CV(D̂_N), one can easily show that √N₀ is a
root of the quadratic equation
C²N₀ − √N₀ + 2C² − 1 = 0.
Solving for √N₀ yields the two roots
√N₀ = [1 ± (1 − 4C²(2C²−1))^{1/2}] / 2C²,
and since
(1 − 4C²(2C²−1))^{1/2} ≥ 1
whenever
C² ≤ ½,
the required sample size for values of C ≤ .5 is
N₀ = {[1 + (1 − 4C²(2C²−1))^{1/2}] / 2C²}².
For example, if C = .25, then N₀ = 284. Table 6 gives values
for N₀ corresponding to coefficients of variation ranging
from .2 to .5.
Table 6. Number of animals, N₀, that must be sighted
to guarantee the estimate, D̂_N, has coefficient
of variation, CV(D̂_N).

CV(D̂_N)    N₀
 .50        20
 .40        48
 .30       142
 .25       284
 .20       671
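As with Table 1, these entries follow from the closed-form root; the sketch below (function name ours) reproduces them.

```python
import math

def nonparametric_sample_size(C):
    """Smallest N0 with CV(D-hat_N) ~ sqrt((sqrt(N0)+1)/(N0+2)) <= C:
    take sqrt(N0) = (1 + sqrt(1 - 4C^2(2C^2 - 1)))/(2C^2) and square."""
    root = (1 + math.sqrt(1 - 4 * C * C * (2 * C * C - 1))) / (2 * C * C)
    return math.ceil(root * root)

for C in (0.50, 0.40, 0.30, 0.25, 0.20):
    print(f"C = {C:.2f}  N0 = {nonparametric_sample_size(C)}")
```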
CHAPTER III
DENSITY ESTIMATION BASED ON A COMBINATION
OF INVERSE AND DIRECT SAMPLING
3.1 Introduction
When sampling a population by means of line transects,
it is important to keep in mind that the transect length
that can be covered by an observer will be finite. This
poses a problem for the inverse sampling plan since there
will exist the possibility of not seeing the specified number
of animals within the entire length of the transect. There-
fore, it seems reasonable to develop a sampling scheme that
would employ a rule, which allows one to stop when either a
specified number, No, of animals are seen or a fixed distance,
Lo, has been travelled on the transect. In this chapter we
will consider a sampling plan which combines the inverse
sampling procedure discussed in Chapter II and the direct
sampling procedure of Gates et al. (1968).
More precisely, we will define the combined sampling
method as follows:
1. Place a line at random across the area, A, to be
sampled.
2. Specify a fixed number of animals, N₀ > 2, and a
fixed transect length, L₀, and then continue sampling
along the transect until either N₀ animals are seen
or a distance, L₀, has been travelled.
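Under the model assumptions of Chapter II (sightings form a Poisson process with rate θ per unit length; right angle distances are Exponential(λ) under the Gates form of g), one run of this stopping rule can be simulated as follows. This is our illustrative sketch; the function and parameter names are not from the dissertation.

```python
import random

def combined_sample(theta, lam, n0, l0, rng=random):
    """Simulate the combined stopping rule: walk the transect until
    either n0 animals are seen or a distance l0 has been covered.
    Returns (number seen, length travelled, right angle distances)."""
    length, ys = 0.0, []
    while len(ys) < n0:
        gap = rng.expovariate(theta)      # T_i ~ Exponential(theta)
        if length + gap > l0:             # transect exhausted first
            return len(ys), l0, ys
        length += gap
        ys.append(rng.expovariate(lam))   # Y_i ~ Exponential(lam)
    return len(ys), length, ys            # stopped at the n0-th sighting
```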
Since the above method merely incorporates the individual
stopping rules from the inverse and direct sampling methods,
it seems reasonable to use the estimate
D̂_CP = D̂_u if N = N₀;  D̂_g if N < N₀,                    (3.1)
where N is a random variable corresponding to the actual num-
ber of animals sighted using combined sampling, D̂_u is the
inverse sampling estimator given in (2.7) and D̂_g is an esti-
mator appropriate for the direct sampling case. In other
words, the combined sampling procedure uses the inverse sam-
pling estimate if sampling terminates after N₀ animals are
seen and the direct sampling estimate if sampling terminates
after travelling a distance L₀. In Section 3.5 we will also
show that D̂_CP has a maximum likelihood justification.
Before proceeding to derive the mean and variance for
D̂_CP, we need an estimate appropriate for the direct sampling
case.
3.2 Gates' Estimate
Based on the direct sampling approach and assuming
g(y) = e^{-λy},  y ≥ 0,
Gates et al. (1968) developed the estimate
D̂_d = 0 if n = 0,1;  n(n-1) / (2L₀ Σ_{i=1}^n yᵢ) if n ≥ 2,   (3.2)
where L₀ is the fixed length of the transect, n is an observed
value for the random variable N_d, the number of animals seen
using direct sampling, and yᵢ is an observed value for the
random variable Yᵢ, i = 1,2,...,n, the right angle distance
to the ith animal seen. In what follows, we shall show that
the variance of D̂_d is not finite. First, we need a result
concerning the joint density of the Yᵢ, i = 1,2,...,N_d, condi-
tional on N_d.
Theorem 3. Under the assumptions stated in Section 2.2.1,
conditional on N_d = n > 0, the random variables Y₁,Y₂,...,Y_{N_d}
are independently, identically distributed with common density
f_Y(y) = λe^{-λy},  y > 0, λ > 0.
Consequently, conditional on N_d = n > 0, the random variable
Σ_{i=1}^{N_d} Yᵢ has a Gamma distribution with parameters n and λ.
Proof: We want to show that for yᵢ > 0, i = 1,2,...,n,
f_{Y₁,Y₂,...,Y_{N_d}}(y₁,y₂,...,yₙ | N_d = n) = λⁿ e^{-λ Σ_{i=1}^n yᵢ}.
Recall that in the direct sampling procedure, the total
length travelled, L₀, is fixed, and define Lₙ to be the random
variable representing the total length travelled on the tran-
sect when the nth animal is sighted. Then the events {N_d = n}
and {Lₙ ≤ L₀ < Lₙ₊₁} are equivalent, so that
f_{Y₁,...,Y_{N_d}}(y₁,...,yₙ | N_d = n) = f_{Y₁,...,Yₙ}(y₁,...,yₙ | Lₙ ≤ L₀ < Lₙ₊₁).
Now by Theorem 2, Y₁,Y₂,...,Yₙ, Lₙ and Lₙ₊₁ are mutually
independent, and
f_Y(yᵢ) = g(yᵢ)/c.
Consequently,
f_{Y₁,...,Y_{N_d}}(y₁,...,yₙ | N_d = n) = ∏_{i=1}^n f_Y(yᵢ)
 = (1/cⁿ) ∏_{i=1}^n g(yᵢ)
 = λⁿ e^{-λ Σ_{i=1}^n yᵢ},
which completes the proof.
It is now easy to show that Var(D̂_d) does not exist.
From Theorem 3, conditional on N_d = n > 0, Σ_{i=1}^{N_d} Yᵢ has a Gamma
distribution with parameters n and λ. Thus, using (2.12)
and (3.2),
E(D̂_d² | N_d = n) = 0 if n = 0,1;  λ²n²(n-1) / [4L₀²(n-2)] if n ≥ 2.
Also, since N_d is the number of sightings in a transect of
length L₀, it follows from Theorem 1 that N_d has a Poisson
distribution with parameter θL₀. Thus
E(D̂_d²) = E_{N_d} E(D̂_d² | N_d)
 = (λ²/4L₀²) Σ_{n=2}^∞ [n²(n-1)/(n-2)] e^{-θL₀}(θL₀)ⁿ/n!
 = +∞,
showing that the variance for the estimate D̂_d defined in
(3.2) is infinite. In fact, as long as
P(N_d = 2) > 0,
the variance of D̂_d cannot be finite.
The problem of infinite variance for D̂_d can be overcome
by replacing D̂_d with D̂_g, where
D̂_g = 0 if n = 0,1,2;  n(n-1) / (2L₀ Σ_{i=1}^n Yᵢ) if n ≥ 3.   (3.3)
Note that the estimate, D̂_g, differs from D̂_d only when n = 2.
Since any estimate of the density based on only 2 sightings
should be effectively 0, the above modification does not
seem to be unreasonable. We will now proceed to derive
expressions for the mean and variance of D̂_g, which are needed
in the sequel.
3.2.1 The Mean and Variance of D̂_g
We will first examine E(D̂_g). Recall from Theorem 3 that,
conditional on N_d = n, n > 0, Σ_{i=1}^{N_d} Yᵢ has a Gamma distribution with
parameters n and λ. Thus
E(1/Σ Yᵢ | N_d = n) = λ/(n-1),  n ≥ 2,
and
E(D̂_g | N_d = n) = 0 if n = 0,1,2;  nλ/2L₀ if n ≥ 3.
Now since N_d is distributed as a Poisson random variable with
parameter θL₀, it follows that
E(D̂_g) = E_{N_d} E(D̂_g | N_d)
 = (λ/2L₀) Σ_{n=3}^∞ n e^{-θL₀}(θL₀)ⁿ/n!
 = (θλ/2){1 − e^{-θL₀}(1+θL₀)}.
Substituting D for θλ/2 (see (2.5)) in the above yields
E(D̂_g) = D{1 − e^{-θL₀}(1+θL₀)},
and after writing μ = θL₀, the expected number of sightings in
a transect of length L₀, we get
E(D̂_g) = D{1 − e^{-μ}(1+μ)}.                            (3.4)
Thus, D̂_g is not strictly unbiased, but the bias arises because
there is a positive probability of obtaining samples of size
1 or 2. However, even for moderate values of μ, the bias in
D̂_g will be small since e^{-μ}(1+μ) tends to zero exponentially
fast. For example, if μ = 10, the relative bias is only .05%.
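Both the closed form (3.4) and the size of the bias are easy to check numerically; the helper names below are ours.

```python
import math

def relative_bias(mu):
    """Relative bias of D-hat_g from (3.4): e^{-mu} (1 + mu)."""
    return math.exp(-mu) * (1 + mu)

def mean_series(mu, terms=400):
    """E(D-hat_g)/D evaluated directly as the Poisson series
    (1/mu) * sum_{n>=3} n e^{-mu} mu^n / n!."""
    total, p = 0.0, math.exp(-mu)      # p = e^{-mu} mu^0 / 0!
    for n in range(1, terms):
        p *= mu / n                    # now p = e^{-mu} mu^n / n!
        if n >= 3:
            total += n * p
    return total / mu
```

The series and the closed form agree to machine precision, and at μ = 10 the relative bias is about 5 × 10⁻⁴.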
Next we will look at Var(D̂_g). Again since, conditional
on N_d = n, n > 0, Σ_{i=1}^{N_d} Yᵢ is distributed as a Gamma random variable
with parameters n and λ, we know that
E[1/(Σ Yᵢ)² | N_d = n] = λ²/(n-1)(n-2),  n > 2,
and
E(D̂_g² | N_d = n) = 0 if n = 0,1,2;  λ²n²(n-1) / [4L₀²(n-2)] if n ≥ 3.
Therefore,
E(D̂_g²) = E_{N_d} E(D̂_g² | N_d)
 = (λ²/4L₀²) Σ_{n=3}^∞ [n²(n-1)/(n-2)] e^{-μ}μⁿ/n!,      (3.5)
and we can write
Var(D̂_g) = (λ²/4L₀²) Σ_{n=3}^∞ [n²(n-1)/(n-2)] e^{-μ}μⁿ/n! − D²{1 − e^{-μ}(1+μ)}².   (3.6)
An approximation to Var(D̂_g), valid for large values of μ,
may be derived in a manner analogous to the method used by
Gates et al. (1968). After writing
n²(n-1)/(n-2) = n² + n + 2 + 4/(n-2),
it is easy to see that for n ≥ 3,
n² + n + 2 ≤ n²(n-1)/(n-2) ≤ n² + n + 6.
Thus, lower and upper bounds for E(D̂_g²) are
LB = (λ²/4L₀²) Σ_{n=3}^∞ (n²+n+2) e^{-μ}μⁿ/n! ≤ E(D̂_g²)
and
UB = (λ²/4L₀²) Σ_{n=3}^∞ (n²+n+6) e^{-μ}μⁿ/n! ≥ E(D̂_g²).
Now
UB − LB = (λ²/L₀²) Σ_{n=3}^∞ e^{-μ}μⁿ/n!
 = (λ²/L₀²)(1 − e^{-μ} − μe^{-μ} − ½μ²e^{-μ}).
Upon using the relationships
D = θλ/2 and μ = θL₀,
we get
UB − LB = (4D²/μ²)(1 − e^{-μ}(1 + μ + ½μ²)),
which tends to 0 as μ → ∞. Thus, a reasonable approximation
for E(D̂_g²) is
E(D̂_g²) ≈ (UB+LB)/2
 = (λ²/4L₀²) Σ_{n=3}^∞ (n²+n+4) e^{-μ}μⁿ/n!
 = (D²/μ²)[μ² + 2μ + 4 − e^{-μ}(4 + 6μ + 5μ²)].          (3.7)
From (3.7) an approximation for Var(D̂_g) is
Var(D̂_g) ≈ D²{1 + 2/μ + 4/μ² − e^{-μ}(5 + 6/μ + 4/μ²)} − D²{1 − 2e^{-μ}(1+μ) + e^{-2μ}(1+μ)²}
 = D²{2/μ + 4/μ² − e^{-μ}(3 − 2μ + 6/μ + 4/μ²) − e^{-2μ}(1+μ)²}.
Now, as μ increases, the terms involving e^{-μ} and e^{-2μ} will
tend to 0 much faster than 2/μ + 4/μ², so that for large μ, we
have the approximation
Var(D̂_g) ≈ D²(2/μ + 4/μ²).                              (3.8)
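A quick numerical check of (3.8) against the exact expression (3.5)-(3.6), with D = 1 (our sketch):

```python
import math

def var_dg_over_d2(mu, terms=400):
    """Exact Var(D-hat_g)/D^2 from (3.5)-(3.6): the second-moment
    series (1/mu^2) sum_{n>=3} n^2 (n-1)/(n-2) e^{-mu} mu^n / n!
    minus the squared mean {1 - e^{-mu}(1 + mu)}^2, for D = 1."""
    m2, p = 0.0, math.exp(-mu)
    for n in range(1, terms):
        p *= mu / n                    # e^{-mu} mu^n / n!
        if n >= 3:
            m2 += n * n * (n - 1) / (n - 2) * p
    mean = 1 - math.exp(-mu) * (1 + mu)
    return m2 / (mu * mu) - mean * mean

mu = 20.0
approx = 2 / mu + 4 / (mu * mu)        # the large-mu approximation (3.8)
```

At μ = 20 the exact value and the approximation already agree to within a few percent.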
We are now in a position to derive the mean and variance
of D̂_CP.
3.3 Expected Value of D̂_CP
Recall that in the combined sampling scheme both N, the
number of animals seen, and L, the distance travelled before
termination of sampling, are random variables. Thus, the
expected value of D̂_CP can be found directly using
E(D̂_CP) = E_N E(D̂_CP | N).
However, before proceeding along these lines it will be help-
ful to have the following theorems.
Theorem 4. Let N be the random variable representing the
number of animals seen using the combined sampling method.
Then under the assumptions stated in Section 2.2.1,
P(N=n) = e^{-μ}μⁿ/n!,  n = 0,1,...,N₀-1;
P(N=N₀) = Σ_{n=N₀}^∞ e^{-μ}μⁿ/n!,
where μ = θL₀ is the expected number of animals sighted along
a transect of length L₀.
Proof: For n < N₀, the event {N=n} is equivalent to the
event {exactly n sightings occur in (0,L₀]}. By Theorem 1,
the number of sightings in (0,L₀] is Poisson with parameter
θL₀. Hence,
P(N=n) = e^{-μ}μⁿ/n!,  n = 0,1,...,N₀-1.
The case N = N₀ follows since the event {N=N₀} is equivalent
to the event {at least N₀ sightings occur in (0,L₀]}.
The following three theorems establish some useful
relationships among the random variables N, L_N and Y₁,Y₂,...,Y_N,
where N is as defined in Theorem 4, L_N represents the total
length travelled on the transect when the Nth animal is
sighted and Yᵢ, i = 1,2,...,N, represents the right angle
distance to the ith animal seen.
Theorem 5. Under the assumptions stated in Section 2.2.1,
the conditional p.d.f. of L_N given N = n > 0, is
f_{L_N}(ℓ | N=n) = nℓ^{n-1}/L₀ⁿ,  0 < ℓ ≤ L₀,  n < N₀;
f_{L_N}(ℓ | N=N₀) = θ^{N₀}ℓ^{N₀-1}e^{-θℓ} / [Γ(N₀) P(N=N₀)],  0 < ℓ ≤ L₀,
where
P(N=N₀) = ∫₀^{L₀} θ^{N₀}t^{N₀-1}e^{-θt}/Γ(N₀) dt.
Proof: First we will consider the case when n < N₀. By
Theorem 1, seeing n < N₀ animals along the transect is equivalent
to observing n Poisson events in the interval (0,L₀]. There-
fore (see Bhat, 1972, p. 129), the joint density of
L₁,L₂,...,L_N conditional on the occurrence of N = n Poisson
events in (0,L₀] is
f_{L₁,L₂,...,L_N}(ℓ₁,...,ℓₙ | N=n) = n!/L₀ⁿ,  0 < ℓ₁ < ... < ℓₙ ≤ L₀,
and the marginal density of L_N conditional on N = n is
f_{L_N}(ℓ | N=n) = nℓ^{n-1}/L₀ⁿ,  0 < ℓ ≤ L₀.
Next, we will consider the case where N = N₀. Define Tᵢ
to be the random variable corresponding to the distance
travelled on the transect between the (i-1)st and ith sight-
ing. Then the N₀th observation is made at
L_{N₀} = Σ_{i=1}^{N₀} Tᵢ.
Now in the combined sampling approach, we will see N = N₀ ani-
mals, if and only if the distance
L_{N₀} = Σ_{i=1}^{N₀} Tᵢ ≤ L₀.
Therefore, if 0 < ℓ ≤ L₀, then
P(L_{N₀} ≤ ℓ | N=N₀) = P(L_{N₀} ≤ ℓ, N=N₀) / P(N=N₀)
 = P(Σ_{i=1}^{N₀} Tᵢ ≤ ℓ) / P(N=N₀).                      (3.9)
Now, since the sightings are Poisson events by Theorem 1,
the random variable
L_{N₀} = Σ_{i=1}^{N₀} Tᵢ
has a Gamma distribution with parameters N₀ and θ. Thus,
f_{L_N}(ℓ | N=N₀) = θ^{N₀}ℓ^{N₀-1}e^{-θℓ} / [Γ(N₀) P(N=N₀)],  0 < ℓ ≤ L₀.
Now, by Theorem 4, we have
P(N=N₀) = Σ_{j=N₀}^∞ e^{-θL₀}(θL₀)ʲ/j!
 = ∫₀^{L₀} θ^{N₀}z^{N₀-1}e^{-θz}/Γ(N₀) dz.
Substituting P(N=N₀) into (3.9) above completes the proof.
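The n < N₀ part of Theorem 5 can be checked by simulation: conditional on exactly n Poisson sightings in (0,L₀], the last sighting has density nℓ^{n-1}/L₀ⁿ and hence mean nL₀/(n+1). The sketch below (names and parameter values are ours) confirms this.

```python
import random

def mean_last_sighting(theta=1.0, l0=5.0, n=3, reps=200000, seed=7):
    """Average position of the last sighting over simulated transects
    that contain exactly n Poisson(theta) sightings in (0, l0]."""
    rng = random.Random(seed)
    total, kept = 0.0, 0
    for _ in range(reps):
        pos, points = 0.0, []
        while True:
            pos += rng.expovariate(theta)
            if pos > l0:
                break
            points.append(pos)
        if len(points) == n:
            total += points[-1]
            kept += 1
    return total / kept
```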
Theorem 6. Under the assumptions stated in Section 2.2.1,
conditional on N = n > 0, the random variables Y₁,Y₂,...,Y_N and
L_N are independent.
Proof: First consider the case N = N₀. Let ℓ ≥ 0 and yᵢ ≥ 0
for i = 1,2,...,N₀. We want to show that
P(Yᵢ ≤ yᵢ, L_N ≤ ℓ, i = 1,2,...,N | N=N₀)
 = P(Yᵢ ≤ yᵢ, i = 1,2,...,N₀) P(L_N ≤ ℓ | N=N₀).
Note that the event {N=N₀} is equivalent to the event {L_{N₀} ≤ L₀},
so that we can write
P(Yᵢ ≤ yᵢ, L_N ≤ ℓ, i = 1,...,N | N=N₀)
 = P(Yᵢ ≤ yᵢ, L_{N₀} ≤ ℓ, i = 1,2,...,N₀ | L_{N₀} ≤ L₀)
 = P(Yᵢ ≤ yᵢ, L_{N₀} ≤ ℓ, L_{N₀} ≤ L₀, i = 1,2,...,N₀) / P(L_{N₀} ≤ L₀).
Now by Theorem 2, Y₁,Y₂,...,Y_{N₀} and L_{N₀} are independent.
Consequently,
P(Yᵢ ≤ yᵢ, L_{N₀} ≤ ℓ, L_{N₀} ≤ L₀, i = 1,2,...,N₀) / P(L_{N₀} ≤ L₀)
 = P(Yᵢ ≤ yᵢ, i = 1,2,...,N₀) P(L_{N₀} ≤ ℓ, L_{N₀} ≤ L₀) / P(L_{N₀} ≤ L₀)
 = P(Yᵢ ≤ yᵢ, i = 1,2,...,N₀) P(L_{N₀} ≤ ℓ | N=N₀).
Now consider the case N = n < N₀, and let ℓ and yᵢ be defined
as before. Also, define X_N to be the actual length travelled
when the combined sampling method is used, that is,
X_N = L_{N₀} if N = N₀;  L₀ if N < N₀.
Then for n < N₀, the events {N=n} and {Lₙ ≤ L₀ < Lₙ₊₁}
are equivalent. Thus,
P(Yᵢ ≤ yᵢ, Lₙ ≤ ℓ, i = 1,2,...,N | N=n)
 = P(Yᵢ ≤ yᵢ, Lₙ ≤ ℓ, i = 1,2,...,n | Lₙ ≤ L₀ < Lₙ₊₁)
 = P(Yᵢ ≤ yᵢ, Lₙ ≤ ℓ, Lₙ ≤ L₀ < Lₙ₊₁, i = 1,2,...,n) / P(Lₙ ≤ L₀ < Lₙ₊₁).
Again by Theorem 2, for N = n, Y₁,Y₂,...,Yₙ and Lₙ, Lₙ₊₁
are independent, so that
P(Yᵢ ≤ yᵢ, Lₙ ≤ ℓ, Lₙ ≤ L₀ < Lₙ₊₁, i = 1,2,...,n) / P(Lₙ ≤ L₀ < Lₙ₊₁)
 = P(Yᵢ ≤ yᵢ, i = 1,2,...,n) P(Lₙ ≤ ℓ, Lₙ ≤ L₀ < Lₙ₊₁) / P(Lₙ ≤ L₀ < Lₙ₊₁)
 = P(Yᵢ ≤ yᵢ, i = 1,2,...,n) P(Lₙ ≤ ℓ | N=n).
Theorem 7. Under the assumptions stated in Section 2.2.1, conditional on N=n>0, the random variables Y_1, Y_2, ..., Y_N are independently, identically distributed with common density

    f_Y(y) = λe^{-λy},  y > 0, λ > 0.

Consequently, conditional on N=n>0, the random variable Σ_{i=1}^N Y_i has a Gamma distribution with parameters n and λ.

Proof: The case N=n<N_0 follows by noting that the random variables N and N_d are equivalent when 0<n<N_0. Now consider the case N=N_0. Since the event {N=N_0} is equivalent to the event {L_{N_0} ≤ L_0},

    f_{Y_1,Y_2,...,Y_{N_0}}(y_1,y_2,...,y_{N_0} | N=N_0)
        = f_{Y_1,Y_2,...,Y_{N_0}}(y_1,y_2,...,y_{N_0} | L_{N_0} ≤ L_0).

Now by Theorem 2, Y_1, Y_2, ..., Y_{N_0} and L_{N_0} are mutually independent and

    f_Y(y_i) = g(y_i)/c.

Consequently,

    f_{Y_1,Y_2,...,Y_{N_0}}(y_1,y_2,...,y_{N_0} | N=N_0)
        = Π_{i=1}^{N_0} f_Y(y_i) = (1/c^{N_0}) Π_{i=1}^{N_0} g(y_i),

and substituting

    g(y_i) = e^{-λy_i}

completes the proof.
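Theorem 7 is easy to illustrate by simulation. In the sketch below (assumed parameter values, chosen only for illustration), sighted right angle distances are drawn as Exp(λ) variates, and the moments of their sum are compared with the Gamma(n, λ) values n/λ and n/λ².

```python
import random

def simulate_sums(n, lam, reps, seed=1):
    # draw `reps` replicates of sum_{i=1}^n Y_i with Y_i ~ Exp(lam)
    rng = random.Random(seed)
    sums = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(reps)]
    mean = sum(sums) / reps
    var = sum((s - mean) ** 2 for s in sums) / reps
    return mean, var

n, lam = 8, 0.5
mean, var = simulate_sums(n, lam, reps=20000)
print(mean, n / lam)     # both near 16, the Gamma(n, lam) mean
print(var, n / lam**2)   # both near 32, the Gamma(n, lam) variance
```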
We are now ready to determine the expected value of D̂_CP given N=n. For n = 0, 1, 2,

    E(D̂_CP | N=n) = E(D̂_g | N=n) = 0.        (3.10)

Next consider the values 3 ≤ n < N_0. Recall from Theorem 7 that, conditional on N=n>0, Σ_{i=1}^n Y_i has a Gamma distribution with parameters n and λ. Then using expressions (3.1) and (3.3) it follows that

    E(D̂_CP | N=n) = E(D̂_g | N=n)
                  = E[ N(N-1) / (2L_0 Σ_{i=1}^N Y_i) | N=n ]
                  = nλ / (2L_0).        (3.11)

Finally, for N=N_0 it follows from Theorem 5, Theorem 6, Theorem 7, and expressions (3.1) and (3.3) that

    E(D̂_CP | N=N_0) = E(D̂_U | N=N_0)
        = [(N_0-1)²/2] E[ 1/Σ_{i=1}^{N_0} Y_i | N=N_0 ] E[ 1/L_{N_0} | N=N_0 ]
        = [(N_0-1)λ / (2P(N=N_0))] ∫_0^{L_0} θ^{N_0} ℓ^{N_0-2} e^{-θℓ} / Γ(N_0) dℓ.

Then using the transformation z = θℓ, we get

    E(D̂_CP | N=N_0) = [θλ / (2P(N=N_0))] Σ_{n=N_0-1}^∞ e^{-θL_0} (θL_0)^n / n!.        (3.12)
We can now evaluate the expected value of D̂_CP. Using Theorem 4 and expressions (2.5), (3.10), (3.11), and (3.12), we find that

    E(D̂_CP) = E_N[ E(D̂_CP | N) ]
             = Σ_{n=3}^{N_0-1} (nλ/2L_0) P(N=n) + (θλ/2) Σ_{n=N_0-1}^∞ e^{-μ} μ^n / n!
             = D[1 - e^{-μ}(1+μ)],        (3.13)

where μ = θL_0.
Thus D̂_CP is a biased estimate for the density. Note that the bias here is equal to the bias of the modified estimate, D̂_g, in direct sampling. This is as expected since, in the combined sampling procedure, we are simply choosing the estimate that corresponds to the reason for terminating sampling. If we stop sampling after seeing the N_0th animal, then the inverse sampling estimate is used, and, likewise, if sampling stops after travelling the distance L_0, then the direct sampling estimate is used.
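The bias factor in (3.13) is easy to tabulate. The sketch below (illustrative values of μ = θL_0) shows that E(D̂_CP)/D = 1 - e^{-μ}(1+μ) climbs quickly toward 1, so the bias is negligible once the expected number of sightings θL_0 is moderately large.

```python
import math

def bias_factor(mu):
    # E(D_CP)/D from (3.13): 1 - e^{-mu}(1 + mu)
    return 1.0 - math.exp(-mu) * (1.0 + mu)

for mu in (1.0, 5.0, 10.0, 20.0):
    print(mu, bias_factor(mu))  # approaches 1 as mu grows
```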
3.4 Variance of D̂_CP

An expression for the variance of D̂_CP can be found directly using the formula

    Var(D̂_CP) = E(D̂²_CP) - {E(D̂_CP)}².        (3.14)

In the preceding section we derived E(D̂_CP), so that our problem reduces to evaluating E(D̂²_CP). Proceeding along the same lines as in Section 3.3, we quickly find

    E(D̂²_CP | N=n) = 0,  n = 0, 1, 2,

and, for 3 ≤ n < N_0,

    E(D̂²_CP | N=n) = E(D̂²_g | N=n) = n²(n-1)λ² / [4L_0²(n-2)].        (3.15)
Similarly, for N=N_0,

    E(D̂²_CP | N=N_0) = E(D̂²_U | N=N_0)        (3.16)
        = [(N_0-1)⁴/4] E[ 1/(Σ_{i=1}^{N_0} Y_i)² | N=N_0 ] E[ 1/L²_{N_0} | N=N_0 ]
        = [(N_0-1)³λ² / (4(N_0-2)P(N=N_0))] ∫_0^{L_0} θ^{N_0} ℓ^{N_0-3} e^{-θℓ} / Γ(N_0) dℓ
        = [(N_0-1)²λ²θ² / (4(N_0-2)²P(N=N_0))] Σ_{n=N_0-2}^∞ e^{-θL_0} (θL_0)^n / n!.        (3.17)
Then, using Theorem 4, expressions (2.5), (3.15), (3.16), and (3.17), and letting μ = θL_0, it follows that

    E(D̂²_CP) = E_N[ E(D̂²_CP | N) ]
        = (λ²/4L_0²) Σ_{n=3}^{N_0-1} [n²(n-1)/(n-2)] e^{-μ} μ^n / n!
          + [λ²θ²(N_0-1)² / (4(N_0-2)²)] Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
        = (λ²θ²/4) Σ_{n=3}^{N_0-1} [n/((n-2)(n-2)!)] e^{-μ} μ^{n-2}
          + [λ²θ²(N_0-1)² / (4(N_0-2)²)] Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
        = D²[ Σ_{n=1}^{N_0-3} ((n+2)/n) e^{-μ} μ^n / n!
              + ((N_0-1)²/(N_0-2)²) Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n! ].        (3.18)
An expression for the variance of D̂_CP is now evident. Using (3.13), (3.14), and (3.18), we get

    Var(D̂_CP) = D²{ Σ_{n=1}^{N_0-3} ((n+2)/n) e^{-μ} μ^n / n!
                    + ((N_0-1)²/(N_0-2)²) Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
                    - [1 - e^{-μ}(1+μ)]² },        (3.19)

where μ = θL_0.
Note that

    lim_{L_0→∞} Var(D̂_CP) = [(N_0-1)²/(N_0-2)² - 1] D²,

and

    lim_{N_0→∞} Var(D̂_CP) = D²{ Σ_{n=1}^∞ ((n+2)/n) e^{-μ} μ^n / n! - [1 - e^{-μ}(1+μ)]² }.

After some simple algebraic manipulations and using the relationship D = θλ/2, one can easily show that the limit as L_0→∞ and the limit as N_0→∞ are equal to the Var(D̂_U) given in (2.13) and the Var(D̂_g) given in (3.6), respectively. These limiting values are as expected, since letting L_0→∞ in the combined sampling approach is equivalent to using inverse sampling, while letting N_0→∞ is equivalent to using direct sampling.
We will now show that Var(D̂_CP) can be expressed as a function of both Var(D̂_U) and Var(D̂_g) given in (2.13) and (3.6), respectively. Writing the equation in this form will then lead directly to an approximation for Var(D̂_CP).
First note that (3.19) can be rewritten as

    Var(D̂_CP)/D² = Σ_{n=1}^{N_0-3} ((n+2)/n) e^{-μ} μ^n / n!
                   + ((N_0-1)²/(N_0-2)²) Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
                   - [1 - e^{-μ}(1+μ)]².        (3.20)

Adding and subtracting the terms

    Σ_{n=N_0-2}^∞ ((n+2)/n) e^{-μ} μ^n / n!   and   Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!

to the right hand side of (3.20) yields

    Var(D̂_CP)/D² = { Σ_{n=1}^∞ ((n+2)/n) e^{-μ} μ^n / n! - [1 - e^{-μ}(1+μ)]² }
                   + [(N_0-1)²/(N_0-2)² - 1] Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
                   - Σ_{n=N_0-2}^∞ [(n+2)/n - 1] e^{-μ} μ^n / n!.

Now, after multiplying both sides of the previous expression by D², using the relationship D = θλ/2, and substituting the expressions for Var(D̂_U) and Var(D̂_g) given in (2.13) and (3.6), respectively, we see that

    Var(D̂_CP) = Var(D̂_g) + Var(D̂_U) Σ_{n=N_0-2}^∞ e^{-μ} μ^n / n!
                 - 2D² Σ_{n=N_0-2}^∞ e^{-μ} μ^n / (n·n!).        (3.21)

Therefore, an approximation for Var(D̂_CP) can be simply obtained by using the approximation for Var(D̂_g) given in (3.8).
3.5 Maximum Likelihood Justification for D̂_CP

In Section 3.1 we stated that the estimate

    D̂_CP = D̂_g if N < N_0,   D̂_CP = D̂_U if N = N_0,

could be justified using the maximum likelihood procedure. To show this, we first need the joint density function for Y_1, Y_2, ..., Y_N, L_N and N, i.e., f_{Y,L_N,N}(y,ℓ,n). By Theorem 6, f_{Y,L_N,N}(y,ℓ,n) can be written as

    f_{Y,L_N,N}(y,ℓ,n) = f_Y(y | N=n) f_{L_N}(ℓ | N=n) P(N=n).

The functional form for f_{Y,L_N,N}(y,ℓ,n) is now evident. Using Theorems 4 and 5 and recalling from Theorem 7 that

    f_Y(y | N=n) = λⁿ e^{-λ Σ_{i=1}^n y_i},  λ > 0,

we obtain

    f_{Y,L_N,N}(y,ℓ,n) =
        λⁿ e^{-λ Σ_{i=1}^n y_i} · (n ℓ^{n-1}/L_0ⁿ) · e^{-θL_0}(θL_0)ⁿ/n!,   N = n < N_0,
        λ^{N_0} e^{-λ Σ_{i=1}^{N_0} y_i} · θ^{N_0} ℓ^{N_0-1} e^{-θℓ}/(N_0-1)!,   N = N_0.        (3.22)

As shown in Section 2.3, the maximum likelihood estimate for D is given by

    D̂ = θ̂λ̂/2,

where θ̂ and λ̂ are maximum likelihood estimates for θ and λ, respectively. Finding maximum likelihood estimates for θ and λ is now straightforward. Taking the natural logarithm of the likelihood function and setting the partial derivatives with respect to θ and λ equal to 0 yields, for n>0,

    θ̂ = N/L_0 if N = n < N_0,   θ̂ = N_0/L_{N_0} if N = N_0,

and

    λ̂ = N / Σ_{i=1}^N Y_i.

Thus, a maximum likelihood estimate for D using combined sampling would be

    D̂* = N² / (2L_0 Σ_{i=1}^N Y_i) if N = n < N_0,
    D̂* = N_0² / (2L_{N_0} Σ_{i=1}^{N_0} Y_i) if N = N_0.

Our estimate D̂_CP is obtained by correcting the estimates θ̂ and λ̂ for bias and noting that E(D̂_CP | N=n) exists only for values of N > 2.
CHAPTER IV
DENSITY ESTIMATION FOR CLUSTERED POPULATIONS
4.1 Introduction
The estimation procedures developed in Chapters II and
III are based on the assumption that the sightings of ani-
mals are independent events. These methods would be appli-
cable to animal populations that are generally made up of
solitary individuals, such as ruffed grouse, snowshoe hare
and gila monster. However, there are other types of animals
which aggregate into coveys, schools and other tight groups.
Animals behaving in this way will be said to belong to clus-
tered populations. Some examples of clustered populations
are bobwhite quail, gray partridge and porpoise. In these
cases the assumption of independent sightings is certainly
not valid, and a different procedure would have to be used.
The line transect method could be easily generalized to
provide estimates for clustered populations. As noted by
Anderson et al. (1976, p. 12), if we amend the assumptions
in Section 2.2.1 so that they refer to clusters of animals
rather than individual animals, then the results of Chap-
ters II and III are directly applicable to the estimation
of the cluster density, Dc. The estimate for Dc will be
based on the right angle distances to the clusters from the
random line transect. In the case where the number of ani-
mals in every sighted cluster can be determined without error,
an estimate for the population density D is given by

    D̂ = D̂_c s̄,

where D̂_c is the estimate for D_c and s̄ is the average size of the observed clusters.
Some criticisms of the approach outlined in the preced-
ing paragraph are possible. First of all, it may not be
possible to determine the distance to a cluster as easily
(or as accurately) as the distance to an animal. How will
this distance be defined? Secondly, the simple modification
of the assumptions in Section 2.2.1, obtained by replacing
the word "animal" by the word "cluster" would imply that the
probability of sighting a cluster depends only on its right
angle distance from the line. This may not be a reasonable
assumption since the probability of sighting a larger cluster
is likely to be greater than the probability of sighting
a smaller cluster. Finally, the sighting of a cluster may
not necessarily mean that all of the animals comprising the
cluster are seen and counted by the observer. In this case,
a more reasonable assumption would be to let the probability
of sighting an animal belonging to a cluster depend on the
distance to the cluster as well as the true cluster size.
In this chapter we shall propose a density estimate for
a clustered population by assuming, among other things, that
it is possible to determine the distance to the center of the
cluster from the line transect. An estimation procedure will
then be developed using a model in which the observer's count
of the number of animals in a cluster is regarded as a random
variable with a probability distribution depending upon the
right angle distance and the size of the cluster.
4.2 Assumptions
The density estimate that we will develop is based on
the inverse sampling approach outlined in Section 2.1, with
one minor modification. In clustered populations the plan
is to continue sampling along a randomly placed cransect
until a prespecified number, Nc, of clusters (rather than
animals) are seen. As each cluster is sighted, the follow-
ing information is recorded:
1. the right angle distance, y, from the transect
to the center of the cluster
2. the observed number of animals, s, in the cluster.
(this may be less than the true size of the cluster)
3. the actual distance, Z, travelled by the observer
to sight N clusters.
The sampling procedure described above may be used to
construct an estimate of the population density under the
following set of assumptions. These assumptions closely
parallel those of Section 2.2.1 with the exception that
they are now phrased in terms of clusters rather than indi-
vidual animals.
B1. The clusters are randomly distributed with rate (density) D_c over the area of interest, A.
B2. The clusters are independently distributed over A, i.e., given two disjoint regions of area, δA_1 and δA_2,

    P(n_1 clusters are in δA_1 and n_2 clusters are in δA_2)
        = P(n_1 clusters are in δA_1) P(n_2 clusters are in δA_2).
B3. Clusters are fixed, i.e., there is no confusion
over clusters moving during sampling and none are
counted twice.
B4. There exists a probability mass function p(-)
defined on the set of positive integers, such that
p(r) is the probability that r is the true size
of a cluster located at a right angle distance, y,
from the transect. Note that p(r) is independent
of y. In probability notation, if R and Y denote
the random variables representing the true cluster
size and the right angle distance to the cluster,
respectively, then
    P(R=r | Y=y) = p(r),  r = 1, 2, ...        (4.1)
B5. The probability of observing a cluster depends
only on the size of the cluster and the distance
from the transect to the cluster.
B6. There exists a non-negative function h(·) defined on [0,∞) such that

    0 ≤ h(·) ≤ 1,
    h(0) = 1,

and the probability of observing s animals belonging to a cluster of size r ≥ s located at a right angle distance y from the transect is

    C(r,s) [h(y)]^s [1-h(y)]^{r-s},

where C(r,s) denotes the binomial coefficient. That is, if Y and S denote the random variables representing the right angle distance to a cluster and the observed number of animals in a cluster, respectively, then

    P(S=s | R=r, Y=y) = C(r,s) [h(y)]^s [1-h(y)]^{r-s}.        (4.2)
Closer examination of assumption B6 shows that we are
now allowing the probability of observing a cluster to depend
on both the right angle distance, y, and the true cluster
size, r. To see this, first let C be the event that a cluster
is observed. Then the probability of observing a cluster of
size r located at a distance y from the transect is

    P(C | R=r, Y=y) = Σ_{s=1}^r P(S=s | R=r, Y=y)
                    = 1 - P(S=0 | R=r, Y=y)
                    = 1 - [1-h(y)]^r,        (4.3)

which clearly depends on both y and r.

Assumption B6 also satisfies the reasonable requirement that for a fixed right angle distance y > 0 and r_1 < r_2,

    P(C | R=r_1, Y=y) ≤ P(C | R=r_2, Y=y).

This follows immediately from equation (4.3). Note that

    P(C | R=r_1, Y=y) = 1 - [1-h(y)]^{r_1}

and

    P(C | R=r_2, Y=y) = 1 - [1-h(y)]^{r_2}.

Now, since 0 ≤ h(y) ≤ 1, it is clear that

    P(C | R=r_1, Y=y) ≤ P(C | R=r_2, Y=y).
One final note with regard to assumption B6 is in order.
In the case where every cluster has size 1, i.e.,
P(R= 1) =1,
the probability of sighting a cluster located at a right
angle distance y is simply h(y). This is quickly seen by
setting r= 1 in (4.3). Thus, under these circumstances,
h(y) has the same interpretation as g(y) defined in Sec-
tion 2.2.1, that is, h(y) is the conditional probability of
sighting an animal at distance y given there is an animal
at y.
4.3 General Form of the Likelihood Function
We will use the maximum likelihood procedure to obtain
an estimate for D, the animal population density. To obtain
the likelihood function, we first need an expression for the
probability density function
fS,Y,L s'y' )
where S= (S1,S2, .... SN ) is the vector of random variables
c
representing the actual number of animals seen in the clusters,
Y= (Yi ,\Y ...2' Y ) is the vector of random variables repre-
c
senting the right angle distances from the clusters to the
transect and L is the random variable representing the total
length travelled on the transect to see H clusters. Upon
writ ing
fS ( = fS I 'Y, L(s)I'y) f [, L 1)fL (9(), (4.4)
S. 'i L S |YSI^,LilL- L
it is seen that specifying the joint probability density
function for S,Y and L is equivalent to specifying the three
functions on the right hand side of (4.4).
The density functions f_{Y|L}(y|ℓ) and f_L(ℓ) may be derived in a manner analogous to that used in Section 2.2. Let g_c(y) denote the probability of sighting a cluster located at a right angle distance y from the transect, that is,

    g_c(y) = P(observe a cluster | Y=y).

Since sighting a cluster located at a distance y is equivalent to observing at least one animal belonging to the cluster, we can write

    g_c(y) = Σ_{s=1}^∞ P(observe s animals | Y=y) = Σ_{s=1}^∞ P(S=s | Y=y).

Now, for s ≥ 1,

    P(S=s | Y=y) = Σ_{r=s}^∞ P(S=s | R=r, Y=y) P(R=r | Y=y).

By assumption B4, it follows that Y and R are independent random variables. Thus, using (4.1) and (4.2) we get

    P(S=s | Y=y) = Σ_{r=s}^∞ C(r,s) [h(y)]^s [1-h(y)]^{r-s} p(r).

Therefore,

    g_c(y) = Σ_{s=1}^∞ Σ_{r=s}^∞ C(r,s) [h(y)]^s [1-h(y)]^{r-s} p(r).        (4.5)

Now, according to assumption B6,

    h(0) = 1,

so that

    g_c(0) = Σ_{s=1}^∞ p(s) = 1.
Therefore, the function g_c(y) plays a role similar to the role of g(y) in Section 2.2. Consequently, by regarding a "cluster" as an "animal," the results of Section 2.2 can be applied to clustered populations in a straightforward manner.

Let N_c(ℓ) denote the random variable representing the number of clusters seen when travelling a distance ℓ on the transect. Then, by Theorem 1, N_c(ℓ) is a Poisson process with parameter θ*ℓ, where θ*ℓ is the expected number of clusters seen when travelling along a transect of length ℓ. Also, from Theorem 1 we see that the respective analogs to equations (2.2) and (2.1) are

    D_c = θ* / (2c*),        (4.6)

where D_c is the density of clusters, and

    c* = ∫_0^∞ g_c(y) dy.        (4.7)

Furthermore, Theorem 2 gives us the results that L and Y are mutually independent random variables, L is distributed as a Gamma random variable with parameters N_c and θ*, and the conditional density of Y given L=ℓ is

    f_{Y|L}(y | ℓ) = (1/(c*)^{N_c}) Π_{i=1}^{N_c} g_c(y_i).        (4.8)
Now, assumption B5 implies that the number of animals actually observed in a cluster depends only on the right angle distance to the cluster, Y, and the size of the cluster, R. Thus, S is independent of L, and since Y is also independent of L it follows that

    f_{S|Y,L}(s | y, ℓ) = f_{S|Y}(s | y) = Π_{i=1}^{N_c} P(S_i=s_i | Y_i=y_i).        (4.9)
We can now write an expression for the likelihood function L(θ*, p(·), h(·); s, y, ℓ). Using (4.4), (4.8) and (4.9), and recalling that L has a Gamma distribution with parameters N_c and θ*, we obtain

    L(θ*, p(·), h(·); s, y, ℓ)
        = [ Π_{i=1}^{N_c} P(S_i=s_i | Y_i=y_i) ]
          × [ Π_{i=1}^{N_c} g_c(y_i) ] (θ*)^{N_c} ℓ^{N_c-1} e^{-θ*ℓ} / [(c*)^{N_c} Γ(N_c)].        (4.10)
4.4 Estimation of D when p(·) and h(·)
Have Specific Forms
For a clustered population with a cluster density, D_c, the animal population density may be defined as

    D = D_c ν,

where ν = E(R) is the expected cluster size. Upon using the expression for D_c given in (4.6), we get

    D = θ*ν / (2c*),        (4.11)
so that maximum likelihood estimation of D can be carried out by using (4.11) in the likelihood function presented in (4.10).

Since the random variables S and Y are independent of L, it is easily seen from (4.10) that the maximum likelihood estimate of θ*, corrected for bias, is

    θ̂* = (N_c - 1)/L.        (4.12)

However, finding estimates of ν and c* can be quite difficult depending upon the nature of the functions p(·) and h(·). Very likely, one has to resort to some iterative technique such as the Newton-Raphson method (see Korn and Korn, 1968, eqn. (20.2-31)) to solve the likelihood equation.
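As a one-dimensional illustration of the Newton-Raphson idea (not the dissertation's actual computation), the sketch below solves the moment equation s̄ = a/(1 - e^{-a}) for a truncated-Poisson parameter a, iterating a ← a - g(a)/g'(a) with an analytic derivative. The target value 2.24 is borrowed from the worked example later in this chapter.

```python
import math

def mean_trunc_poisson(a):
    # expected cluster size a/(1 - e^{-a}) under a truncated Poisson model
    return a / (1.0 - math.exp(-a))

def solve_alpha(sbar, a=1.0, tol=1e-12):
    # Newton-Raphson on g(a) = a/(1-e^{-a}) - sbar
    for _ in range(100):
        u = 1.0 - math.exp(-a)
        g = a / u - sbar
        gprime = (u - a * math.exp(-a)) / (u * u)  # d/da of a/(1-e^{-a})
        step = g / gprime
        a -= step
        if abs(step) < tol:
            break
    return a

a_root = solve_alpha(2.24)
print(a_root, mean_trunc_poisson(a_root))  # the second value recovers 2.24
```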
It is apparent that there exist a wide variety of functions which satisfy the requirements of p(·) and h(·). The appropriate choice in a particular problem depends on the nature of the population under investigation. In this work we will consider the functions

    p(r) = α^r e^{-α} / [r!(1 - e^{-α})],  α > 0, r = 1, 2, ...,        (4.13)

and

    h(y) = e^{-λ*y},  λ* > 0, y ≥ 0.        (4.14)

It is easily seen that p(·) given by (4.13) represents a truncated Poisson distribution. The expected cluster size ν is therefore given by

    ν = α / (1 - e^{-α}).        (4.15)

The limiting case α = 0 corresponds to a population in which the cluster size is 1 with probability 1. Thus, α = 0 corresponds to the model in Section 2.2.
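A quick numerical check of (4.13) and (4.15), using an illustrative value α = 2: the pmf sums to one over r ≥ 1, and its mean equals α/(1 - e^{-α}).

```python
import math

def p_trunc(r, alpha):
    # truncated Poisson pmf (4.13), defined for r = 1, 2, ...
    return alpha**r * math.exp(-alpha) / (math.factorial(r) * (1.0 - math.exp(-alpha)))

alpha = 2.0
total = sum(p_trunc(r, alpha) for r in range(1, 60))
mean = sum(r * p_trunc(r, alpha) for r in range(1, 60))
print(total)                                  # ~ 1
print(mean, alpha / (1 - math.exp(-alpha)))   # both ~ 2.313
```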
The choice for h(·) is based on the fact that when α = 0, h(·) may be interpreted as the function g(·) defined in Chapter II. Because

    g(y) = e^{-λy}

seems to be a popular choice for g(·), we feel that

    h(y) = e^{-λ*y}

is a reasonable choice for h(·).
The likelihood function may now be regarded as a function of θ*, α and λ*, and maximum likelihood estimation of

    D = θ*ν / (2c*)

may be accomplished by expressing ν and c* as functions of α and λ*. We have already seen the form of ν in equation (4.15). To derive an expression for c*, we proceed as follows.
Recall from (4.5) that

    g_c(y) = Σ_{s=1}^∞ P(S=s | Y=y),

where

    P(S=s | Y=y) = Σ_{r=s}^∞ C(r,s) [h(y)]^s [1-h(y)]^{r-s} p(r).

Now using (4.13) and (4.14) in the above equation, we get

    P(S=s | Y=y) = Σ_{r=s}^∞ C(r,s) (e^{-λ*y})^s (1 - e^{-λ*y})^{r-s} α^r e^{-α} / [r!(1-e^{-α})]
                 = [(αe^{-λ*y})^s e^{-α} / (s!(1-e^{-α}))] Σ_{r=s}^∞ [α(1-e^{-λ*y})]^{r-s} / (r-s)!
                 = (αe^{-λ*y})^s e^{-αe^{-λ*y}} / [s!(1-e^{-α})].        (4.16)
Then, substituting for P(S=s | Y=y) in (4.5), we get

    g_c(y) = [e^{-αe^{-λ*y}} / (1-e^{-α})] Σ_{s=1}^∞ (αe^{-λ*y})^s / s!
           = (1 - e^{-αe^{-λ*y}}) / (1 - e^{-α}).        (4.17)

Therefore, using (4.7) and (4.17), we get

    c* = ∫_0^∞ g_c(y) dy = [1/(1-e^{-α})] ∫_0^∞ (1 - e^{-αe^{-λ*y}}) dy.        (4.18)
To evaluate c*, note that

    ∫_0^∞ (1 - e^{-αe^{-λ*y}}) dy = lim_{γ→∞} ∫_0^γ (1 - e^{-αe^{-λ*y}}) dy.        (4.19)

By letting

    t = e^{-λ*y}

in the integral on the right hand side of (4.19), we can show

    ∫_0^γ (1 - e^{-αe^{-λ*y}}) dy = (1/λ*) ∫_{e^{-λ*γ}}^1 (1 - e^{-αt}) / t dt
        = (1/λ*) Σ_{j=1}^∞ (-1)^{j-1} α^j (1 - e^{-jλ*γ}) / (j·j!),

and upon substituting into (4.19), we get

    ∫_0^∞ (1 - e^{-αe^{-λ*y}}) dy = lim_{γ→∞} (1/λ*) Σ_{j=1}^∞ (-1)^{j-1} α^j (1 - e^{-jλ*γ}) / (j·j!).

Since the sum above is absolutely convergent, we can take the limit inside the sum and obtain

    ∫_0^∞ (1 - e^{-αe^{-λ*y}}) dy = (1/λ*) Σ_{j=1}^∞ (-1)^{j-1} α^j / (j·j!).

Then, using (4.18), it follows that

    c* = a(α) / [λ*(1 - e^{-α})],        (4.20)

where

    a(α) = Σ_{j=1}^∞ (-1)^{j-1} α^j / (j·j!).        (4.21)
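The series (4.21) can be verified against the integral in (4.18) numerically. The sketch below (illustrative values α = 2, λ* = 1) compares a(α) with λ* ∫₀^∞ (1 - e^{-αe^{-λ*y}}) dy, truncating the integral at a large upper limit where the integrand is negligible.

```python
import math

def a_series(alpha, terms=60):
    # a(alpha) = sum_{j>=1} (-1)^{j-1} alpha^j / (j * j!)  -- equation (4.21)
    return sum((-1)**(j - 1) * alpha**j / (j * math.factorial(j)) for j in range(1, terms + 1))

def a_by_quadrature(alpha, lam, upper=60.0, steps=60000):
    # lam * integral_0^inf (1 - exp(-alpha e^{-lam y})) dy via Simpson's rule
    f = lambda y: 1.0 - math.exp(-alpha * math.exp(-lam * y))
    h = upper / steps
    s = f(0.0) + f(upper)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * f(k * h)
    return lam * s * h / 3.0

print(a_series(2.0), a_by_quadrature(2.0, 1.0))  # both ~ 1.319
```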
We can now write the likelihood function in terms of θ*, λ* and α. Using equations (4.10), (4.16), (4.20) and (4.21), we get

    L(θ*, λ*, α; s, y, ℓ)
        = [ α^{Σs_i} e^{-λ* Σ s_i y_i} e^{-α Σ e^{-λ*y_i}} / ((1-e^{-α})^{N_c} Π_{i=1}^{N_c} s_i!) ]
          × [ Π_{i=1}^{N_c} (1 - e^{-αe^{-λ*y_i}}) ] (λ*)^{N_c} (θ*)^{N_c} ℓ^{N_c-1} e^{-θ*ℓ}
          / ([a(α)]^{N_c} Γ(N_c)),        (4.22)

where all sums and products run over i = 1, 2, ..., N_c.
Using the likelihood function given in (4.22), we can now obtain an estimate for the population density D. Recall from (4.11) that we can write

    D = θ*ν / (2c*).

After substituting for c* and ν using equations (4.15), (4.20) and (4.21), the expression for D becomes

    D = θ*αλ* / (2a(α)).

Thus, an estimate for D would be

    D̂ = θ̂*α̂λ̂* / (2a(α̂)),

where θ̂*, λ̂* and α̂ are maximum likelihood estimates for θ*, λ* and α, respectively. As noted earlier in this section, S and Y are independent of L, so that θ* can be estimated using equation (4.12). However, this still leaves us the problem of estimating λ* and α. Instead of estimating λ* and α separately, we can reparameterize the likelihood equation in (4.22) by letting

    ρ = αλ* / a(α)

and retaining α as the second parameter. Then, our estimate for D becomes

    D̂* = θ̂*ρ̂ / 2.        (4.23)
The advantage of this reparameterization is that it makes use of the fact that L is independent of both S and Y. Thus, the estimate for D given in (4.23) is the product (apart from the constant factor 1/2) of two independent estimates: θ̂*, which depends on L alone, and ρ̂, which depends on S and Y. As a result, the variance of D̂* can be found easily. Using the formula (see Goodman, 1960) for the variance of the product of two independent estimates, we get

    Var(D̂*) = (1/4)[ E²(θ̂*)Var(ρ̂) + E²(ρ̂)Var(θ̂*) + Var(θ̂*)Var(ρ̂) ].        (4.24)
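Goodman's (1960) result for independent X and Y is Var(XY) = E²(X)Var(Y) + E²(Y)Var(X) + Var(X)Var(Y). A minimal check with arbitrary moments (not data from this chapter):

```python
def var_product(mx, vx, my, vy):
    # Goodman (1960): exact variance of a product of independent estimates
    return mx**2 * vy + my**2 * vx + vx * vy

# direct route for independent X, Y: Var(XY) = E(X^2)E(Y^2) - (E X)^2 (E Y)^2
mx, vx, my, vy = 2.0, 1.0, 3.0, 4.0
direct = (vx + mx**2) * (vy + my**2) - (mx * my)**2
print(var_product(mx, vx, my, vy), direct)  # both 29.0
```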
Since L is distributed as a Gamma random variable with parameters N_c and θ*, exact expressions for E(θ̂*) and Var(θ̂*) can be obtained using (2.4) and (2.11), i.e.,

    E(θ̂*) = θ*,

and

    Var(θ̂*) = (θ*)² / (N_c - 2).        (4.25)
Expressions for the variance and expected value of ρ̂ can become quite complicated. An iterative scheme would be needed to find the values of ρ and α that maximize the reparameterized version of the likelihood function given in (4.22). There are computer programs available that can provide maximum likelihood estimates for ρ and α along with numerical approximations for the variance-covariance matrix
of the estimates. In the next section we will demonstrate
the use of one such program with a set of hypothetical data.
4.5 A Worked Example
In this section we will present a worked example to
demonstrate the use of a computer program to find the estimate
D* and its approximate variance. Because we are not aware of
any real data that have been collected according to the sam-
pling plan described in Section 4.2, we shall use an artifi-
cial set of data in the example.
Suppose that sampling was continued until N_c = 25 clusters
were sighted, and that a transect length of ℓ = 25 miles was
needed to sight the 25 clusters. Suppose further that the
observed right angle distances and the cluster sizes were as
follows, where the first number in the pair is the right
angle distance, y, measured in yards and the second number
in the pair is the corresponding cluster size, s:
(1,1), (3,2), (7,1), (10,1), (2,3)
(5,5), (4,1), (7,2), (15,1), (22,1)
(6,1), (3,6), (2,1), (12,1), (28,3)
(9,2), (18,1), (36,7), (17,6), (5,1)
(4,1), (3,1), (8,2), (3,4), (13,1).
As noted in Section 4.4, an estimate for θ* is

    θ̂* = (N_c - 1)/ℓ = 24/25 = .96,

and an estimate for the variance of θ̂* is

    Var̂(θ̂*) = θ̂*²/(N_c - 2) ≈ .04.
In order to estimate ρ, the reparameterized version of the likelihood function given in (4.22) will have to be maximized. The Fortran subroutine ZXMIN, found in IMSL (1979), may be used for this purpose. This program uses a quasi-Newton iterative procedure to find the minimum of a function. Thus, we first need to take the negative of the log-likelihood before we can use this subroutine to our advantage.

On output, this subroutine not only provides the values at which the function is minimized, but also provides numerical estimates of the second partial derivatives of the function evaluated at the minimization point. Thus, when used with the negative of the log-likelihood, this program will provide the maximum likelihood estimates, ρ̂ and α̂, as well as the matrix of negative second partial derivatives of the log-likelihood, ln L(·), evaluated at ρ̂ and α̂. We will denote this matrix by

    V = ( -∂²ln L/∂α²     -∂²ln L/∂α∂ρ )
        ( -∂²ln L/∂ρ∂α    -∂²ln L/∂ρ²  )

evaluated at α = α̂ and ρ = ρ̂.
For our data, the use of the subroutine ZXMIN with initial values α_I = 2.24 and ρ_I = .16 yielded

    α̂ = 2.844,
    ρ̂ = .0907,

and

    V = (    7.687    -161.229 )
        ( -161.229    5098.985 ).
The initial value used for α was the mean of the observed cluster sizes, i.e.,

    α_I = s̄ = (1/25) Σ_{i=1}^{25} s_i = 2.24.

Since our model does not assume that all animals belonging to a cluster are seen, s̄ would underestimate the expected cluster size, i.e.,

    s̄ < E(R) = α/(1 - e^{-α}).

Thus, s̄ seems to be a good starting value for α.
In choosing an initial value for ρ, first recall that

    ρ = αλ*/a(α),

where a(α) is given in equation (4.21). Since our initial value for α is s̄, all we need is a starting value for λ*. If every animal in a cluster were seen with probability 1, the density of clusters would be estimated by the method described in Chapter II. In this case, the maximum likelihood estimate for λ* would be 1/ȳ, where

    ȳ = (1/25) Σ_{i=1}^{25} y_i.

Thus, as the initial value for ρ we used

    ρ_I = s̄ / [ȳ a(s̄)] = .16.
The estimate for the density can now be calculated. Using (4.23) and substituting the values we obtained for θ̂* and ρ̂ (after converting the per-yard rate ρ̂ to a per-mile rate), we get

    D̂* = 76.7 animals/square mile.
Now if we can obtain a large sample approximation for the variance of ρ̂, then we can use (4.24) as an approximation for the variance of D̂*. Under the usual regularity conditions, V will be a large sample approximation to the inverse of the variance-covariance matrix of α̂ and ρ̂. Furthermore, the approximate variance of D̂* can be obtained from equation (4.24) after substituting the element of V⁻¹ corresponding to the approximate variance of ρ̂ along with the other appropriate quantities. Straightforward calculations show that

    √Var̂(D̂*) ≈ 26.2 animals/square mile.
The use of this Fortran subroutine required a minimal amount of programming to enter the appropriate likelihood function. It was run using the computer facilities of the Northeast Regional Data Center located in Gainesville, Florida. Less than two seconds of CPU time was needed for the estimates to converge to values that agreed to four significant digits on two successive iterations.
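For readers without access to IMSL, the maximization can be reproduced with any general-purpose optimizer. The sketch below is our own reconstruction, not the original ZXMIN program: it codes the negative log-likelihood implied by (4.22) (dropping terms free of α and λ*), replaces the quasi-Newton search with a crude refined grid search, and then forms ρ̂ = α̂λ̂*/a(α̂) and D̂* = θ̂*ρ̂/2. The factor 1760 yards per mile is our assumption for matching the reported per-square-mile units.

```python
import math

# (y_i, s_i): right angle distance in yards, observed cluster size (Section 4.5 data)
data = [(1,1),(3,2),(7,1),(10,1),(2,3),(5,5),(4,1),(7,2),(15,1),(22,1),
        (6,1),(3,6),(2,1),(12,1),(28,3),(9,2),(18,1),(36,7),(17,6),(5,1),
        (4,1),(3,1),(8,2),(3,4),(13,1)]

def a_series(alpha, terms=40):
    # a(alpha) from (4.21)
    return sum((-1)**(j - 1) * alpha**j / (j * math.factorial(j)) for j in range(1, terms + 1))

def negloglik(alpha, lam):
    # negative log of (4.22), keeping only the terms that involve alpha and lam
    n = len(data)
    ll = n * (math.log(lam) - math.log(a_series(alpha)) - math.log(1 - math.exp(-alpha)))
    for y, s in data:
        t = alpha * math.exp(-lam * y)
        ll += s * math.log(alpha) - lam * s * y - t + math.log(1 - math.exp(-t))
    return -ll

# crude grid search with refinement, standing in for the quasi-Newton routine
best = (float("inf"), None, None)
a_lo, a_hi, l_lo, l_hi = 0.5, 6.0, 0.005, 0.2
for _ in range(4):
    a_step, l_step = (a_hi - a_lo) / 40, (l_hi - l_lo) / 40
    for i in range(41):
        for j in range(41):
            a, lm = a_lo + i * a_step, l_lo + j * l_step
            v = negloglik(a, lm)
            if v < best[0]:
                best = (v, a, lm)
    a_hat, l_hat = best[1], best[2]
    a_lo, a_hi = max(0.05, a_hat - 2 * a_step), a_hat + 2 * a_step
    l_lo, l_hi = max(1e-4, l_hat - 2 * l_step), l_hat + 2 * l_step

theta_hat = 24 / 25.0                       # (N_c - 1)/l with l = 25 miles
rho_hat = a_hat * l_hat / a_series(a_hat)   # rho = alpha*lambda*/a(alpha), per yard
D_hat = theta_hat * rho_hat / 2 * 1760      # animals per square mile (1760 yd per mile)
print(a_hat, rho_hat, D_hat)                # roughly 2.84, 0.091, 77
```

The grid search is slower and cruder than ZXMIN but needs no derivatives; the estimates land close to the α̂ = 2.844, ρ̂ = .0907 reported above.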
BIBLIOGRAPHY
Anderson, D. R., Laake, J. L., Crain, B. R., and Burnham, K. P.
(1976), Guidelines for Line Transect Sampling of Biological
Populations, Logan: Utah Cooperative Wildlife Research Unit.
Anderson, D. R., and Pospahala, R. S. (1970), "Correction of
Bias in Belt Transect Studies of Immotile Objects," Journal
of Wildlife Management, 34, 141-146.
Barr, A. J., Goodnight, J. H., Sall, J. P., and Helwig, J. T.
(1976), A User's Guide to SAS 76, Raleigh: SAS Institute.
Bhat, U. N. (1972), Elements of Applied Stochastic Processes,
New York: John Wiley & Sons.
Burnham, K. P., and Anderson, D. R. (1976), "Mathematical
Models for Nonparametric Inferences from Line Transect
Data," Biometrics, 32, 325-336.
Crain, B. R., Burnham, K. P., Anderson, D. R., and Laake, J. L.
(1978), A Fourier Series Estimator of Population Density
for Line Transect Sampling, Logan: Utah State University
Press.
Eberhardt, L. L. (1968), "A Preliminary Appraisal of Line
Transects," Journal of Wildlife Management, 32, 82-88.
Gates, C. E., Marshall, W. H., and Olson, D. P. (1968),
"Line Transect Method of Estimating Grouse Population
Densities," Biometrics, 24, 135-145.
Goodman, L. A. (1960), "On the Exact Variance of Products,"
Journal of the American Statistical Association, 55,
708-713.
Hayne, D. W. (1949), "An Examination of the Strip Census
Method for Estimating Animal Populations," Journal of
Wildlife Management, 13, 145-157.
IMSL (1979), The IMSL Library, Seventh ed., Vol. 3, Houston:
International Mathematical and Statistical Libraries, Inc.
Korn, G. A., and Korn, T. M. (1968), Mathematical Handbook
for Scientists and Engineers, Second ed., New York:
McGraw-Hill.
Leopold, A. (1933), Game Management, New York: Charles
Scribner's Sons.
Lindgren, B. W. (1968), Statistical Theory, Second ed.,
New York: Macmillan.
Loftsgaarden, D. O., and Quesenberry, C. P. (1965), "A Non-
parametric Estimate of a Multivariate Density Function,"
Annals of Mathematical Statistics, 36, 1049-1051.
Pielou, E. C. (1969), An Introduction to Mathematical Ecology,
New York: John Wiley & Sons.
Pollock, K. H. (1978), "A Family of Density Estimators for
Line Transect Sampling," Biometrics, 34, 475-478.
Robinette, W. L., Jones, D. A. Gashwiler, J. S., and Aldous,
C. M. (1954), "Methods for Censusing Winter-Lost Deer,"
North American Wildlife Conference Transactions, 19,
511-524.
Robinette, W. L., Loveless, C. M., and Jones, D. A. (1974),
"Field Tests of Strip Census Methods," Journal of Wild-
life Management, 38, 81-96.
Seber, G. A. F. (1973), The Estimation of Animal Abundance
and Related Parameters, London: Griffin.
Sen, A. R., Tourigny, J., and Smith, G. E. J. (1974), "On the
Line Transect Sampling Method," Biometrics, 30, 329-340.
Smith, M. H., Gardner, R. H., Gentry, J. B., Kaufman, D. W.,
and O'Farrel, M. H. (1975), Small Mammals: Their Pro-
ductivity and Population Dynamics, International Biolog-
ical Program.
Webb, W. L. (1942), "Notes on a Method of Censusing Snowshoe
Hare Populations," Journal of Wildlife Management, 6,
67-69.
BIOGRAPHICAL SKETCH
John Anthony Ondrasik was born on August 17, 1951, in
New Brunswick, New Jersey. Shortly thereafter his parents
moved to Palmerton, Pennsylvania, where he grew up and
attended high school. After graduation in June, 1969, he
entered Bucknell University in Lewisburg, Pennsylvania, and
received the degree of Bachelor of Science with a major in
mathematics in June, 1973.
It was during his studies at Bucknell that he became
interested in statistics through the influence of the late
Professor Paul Benson. In September, 1973, he matriculated
in the graduate school at the University of Florida and
received the degree Master of Statistics in 1975.
While pursuing his graduate studies, he worked for the
Department of Statistics as an assistant in their biosta-
tistics consulting unit. In November, 1978, he accepted the
position of biostatistician with Boehringer Ingelheim, Ltd.
John Ondrasik is married to the former Anntoinette M.
Lucia. Currently they reside in Danbury, Connecticut.
|
Full Text |
PAGE 1
POPULATION DENSITY ESTIMATION USING LINE TRANSECT SAMPLING BY JOHN A. ONDRASIK A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1979
PAGE 2
To Toni For Her Love and Support
PAGE 3
ACKNOWLEDGMENTS I would like to thank my adviser, Dr. P. V. Rao, for his guidance and assistance throughout the course of this research, His patience and thoughtful advice during the writing of this dissertation is sincerely appreciated. I would also like to thank Dr. Dennis D. Wackerly for the help and encouragement that he provided during my years at the University of Florida. Special thanks go to my family for the moral support they provided during the pursuit of this degree. I am especially grateful to my wife, Toni, whose love and understanding made it possible for me to finish this project. Her patience and sacrifices will never be forgotten. Finally, I want to express my thanks to Mrs. Edna Larrick for her excellent job of typing this manuscript despite the time constraints involved. Ill
PAGE 4
TABLE OF CONTENTS Page iii ACKNOWLEDGMENTS LIST OF TABLES vi ABSTRACT vii CHAPTER I INTRODUCTION 1 1.1 Literature Review 1 1.2 Density Estimation Using Line Transects . . 4 1.3 Sununary of Results 9 II DENSITY ESTIMATION USING THE INVERSE SAMPLING PROCEDURE 13 2.1 Introduction 13 2.2 A General Model Based on Right Angle Distances and Transect Length 14 2.2.1 Assumptions 15 2.2.2 Derivation of the Likelihood Function 16 2.3 A Parametric Density Estimate 28 2.3.1 Maximum Likelihood Estimate for D . 28 2.3.2 Unbiased Estimate for D 29 2.3.3 Variance of 6 31 2.3.4 Sample Size Determination Using fi . 32 2.4 Nonparametric Density Estimate 34 2.4.1 The Nonparametric Model for Estimating D 36 2.4.2 An Estimate for fyCO) 37 2.4.3 Approximations for the Mean and Variance of t^iO) 40 2.4.4 A Monte Carlo Study 42 2.4.5 The Expected Value and Variance for a Nonparametric Estimate of D. . . . 46 2.4.6 Sample Size Determination Using t)^ . 47 IV
PAGE 5
TABLE OF CONTENTS (Continued) CHAPTER Page III DENSITY ESTIMATION BASED ON A COMBINATION OF INVERSE AND DIRECT SAMPLING 49 3.1 Introduction 49 3.2 Gates Estimate 50 3.2.1 The Mean and Variance of 6 54 3.3 Expected Value of D„p 57 3.4 Variance of D„p 55 3.5 Maximum Likelihood Justification for Dpp. . 69 IV DENSITY ESTIMATION FOR CLUSTERED POPULATIONS . . 71 4.1 Introduction ji 4.2 Assumptions 73 4.3 General Form of the Likelihood Function . . 76 4.4 Estimation of D when p(«) and h(*) Have Specific Forms 79 4.5 A Worked Example [ [ 86 BIBLIOGRAPHY gO BIOGRAPHICAL SKETCH 92
LIST OF TABLES

TABLE Page
1 Number of animals, N_0, that must be sighted to guarantee the estimate, D̂_u, has coefficient of variation, CV(D̂_u) 34
2 Forms proposed for the function, g(y) 36
3 Results of Monte Carlo Study using g_1(y) = e^{−y} 45
4 Results of Monte Carlo Study using g_2(y) = 1 − y 45
5 Results of Monte Carlo Study using g_3(y) = 1 − y² 46
6 Number of animals, N_0, that must be sighted to guarantee the estimate D̂_N has coefficient of variation, CV(D̂_N) 48
Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

POPULATION DENSITY ESTIMATION USING LINE TRANSECT SAMPLING

By John A. Ondrasik
December 1979
Chairman: Pejaver V. Rao
Major Department: Statistics

The use of line transect methods in estimating animal and plant population densities has recently been receiving increased attention in the literature. Many of the density estimates which are currently available are based only on the right angle distances from the sighted objects to a randomly placed transect of known length. This type of sampling, wherein an observer is required to travel along a line transect of some predetermined length, will be referred to as the direct sampling method. In contrast, one can use an inverse sampling plan which will allow the observer to terminate sampling as soon as he sights a prespecified number of animals.

An obvious advantage of an inverse sampling plan is that sampling is terminated as soon as the required number of objects are sighted. A disadvantage is the possibility that sampling may not terminate in any reasonable period of time. Consequently, a third sampling plan, in which sampling stops as soon as either a prespecified number of objects are sighted or a prespecified length of the transect is traversed, is of practical interest. Such a sampling plan will be referred to as the combined sampling method.

The objective of this dissertation is to develop density estimation techniques suitable for both inverse and combined sampling plans. In Chapter II, both a parametric and a nonparametric estimate for the population density are developed using the inverse sampling approach. We will show that a primary advantage of estimation using inverse sampling is the fact that these estimates can be expressed as the product of two independent random variables. This representation not only enables us to obtain the expected value and variance of our estimates easily, but also leads to a simple criterion for sample size determination.

In Chapter III, we derive a parametric density estimate that is suitable for the combined sampling method. This estimate will be shown to be asymptotically unbiased. An approximation to the variance of this estimate is also provided.

The density estimates developed in Chapters II and III are based on the assumption that the sightings of animals are independent events. In Chapter IV we relax this assumption and develop an estimation procedure using inverse sampling that can be applied to clustered populations, that is, those populations composed of small groups or "clusters" of objects.
CHAPTER I
INTRODUCTION

1.1 Literature Review

Our objective in this dissertation is to examine the problem of density estimation in animal and plant populations. The demand for new and more efficient population density estimates has grown quite rapidly in the past few years. Anderson et al. (1976, p. 1) give a good assessment of the present situation and provide some reasons for the renewed interest in this subject in the following paragraph:

The need to accurately and precisely estimate the size or density of biological populations has increased dramatically in recent years. This has been due largely to ecological problems created by the effects of man's rapidly increasing population. Within the past decade, we have witnessed numerous data gathering activities related to the Environmental Impact Statement (NEPA) process or Biological Monitoring programs. Environmental programs related to phosphate, uranium and coal mining and the extraction of shale oil typically require estimates of the size or density of biological populations. The Endangered Species Act has focused attention on the lack of techniques to estimate population size. It now appears that hundreds of species of plants may be protected under the Act, and, therefore, we will need information on the size of some plant populations. Estimation of the size of biological populations was a major objective of the International Biological Program (IBP) (Smith et al. 1975). Finally, we mention that the ability to estimate population size or density is fundamental to efficient wildlife and habitat management and many important studies in basic ecological research.
The estimation of population size has always been a very interesting and complex problem. For a recent review of the general subject area see Seber (1973). Although many of the methods described in Seber's book are quite useful, they are frequently very expensive and time consuming. Estimation methods based on capture-recapture studies would fall into this category. A further problem with many estimation methods is that they are based on models requiring very restrictive assumptions which severely limit their use in analyzing and interpreting the data. For these reasons and others, line transect sampling schemes are becoming more and more popular. This method of sampling requires an observer to travel along a line transect that has been randomly placed through the area containing the population under study and to record certain measurements whenever a member of the population is sighted. There are several density estimation techniques available using line transect data; however, the full potential is yet to be realized. Density estimation through line transects is typically practical, rapid and inexpensive for a wide variety of populations.

Published references to line transect studies date back to the method used by King (see Leopold, 1933) in the estimation of ruffed grouse populations. Since that time, numerous papers investigating line transect models have appeared, e.g., Webb (1942), Hayne (1949), Robinette et al. (1954), Gates et al. (1968), Anderson and Pospahala (1970), Sen et al. (1974), Burnham and Anderson (1976) and Crain et al. (1978). Since it is commonly assumed by these authors that the objects being sampled are fixed with respect to the transect, line transect models are best suited for either immotile populations, flushing populations (populations where the animal being observed makes a conspicuous response upon the approach of the observer) or slow moving populations. Examples of such populations are: (i) immotile birds' nests, dead deer and plants, (ii) flushing grouse, pheasants and quail, and (iii) slow moving desert tortoise and gila monster. The degree to which line transect methods can be applied to more motile populations, such as deer and hare, will depend on the degree to which the basic assumptions are met. In any case, one should proceed cautiously when using these models for motile populations.

Despite the wide applicability of line transect methods, the estimation problem has only recently begun to receive rigorous treatment and attention from a statistical standpoint. Gates et al. (1968) were the first to develop a density estimation procedure within a standard statistical framework. After making certain assumptions with regard to the probability of sighting an animal located at a given right angle distance from the transect, they rigorously derived a population density estimate. In addition, they were the first authors to provide an explicit form for the approximate sampling variance of their density estimate.
While the assumptions of Gates et al. (1968) concerning the probability of sighting an animal did work well for the ruffed grouse populations they were studying, it is clear that the validity of their assumptions will be quite crucial in establishing the validity of their density estimates. If the collected data fail to substantiate their assumptions, large biases could occur in the estimates, as seen in Robinette et al. (1974). As a result, Sen et al. (1974) and Pollock (1978) relaxed the assumptions of Gates et al. (1968) by using more general forms for the sighting probability, while Burnham and Anderson (1976) developed a nonparametric approach as a means of providing a more robust estimation procedure.

In the following sections, we will outline the general problem of density estimation using line transects, give our approach to the solution of this problem and summarize the results found in the remainder of this work.

1.2 Density Estimation Using Line Transects

The line transect method is simply a means of sampling from some unknown population of objects that are spatially distributed. In the context of animal or plant population density estimation, these objects take the form of mammals, birds, plants, nests, etc., which are distributed over a particular area of interest. From this point on, our references will always be to animal populations with the understanding that the estimation methods we describe are applicable to all populations which satisfy the necessary assumptions.
In the line transect sampling procedure, a line is randomly placed across an area, A, that contains the unknown population of interest. An observer follows the transect and records one or more of the following three pieces of information for each animal sighted:
(i) The radial distance, r, from the observer to the animal.
(ii) The right angle distance, y, from the animal to the line transect.
(iii) The sighting angle, θ, between the line transect and the line joining the observer to the point at which the animal is sighted.
These measurements are illustrated in Figure 1.

Figure 1. Measurements recorded using line transect sampling. (Z is the position of an observer when an animal is sighted at X. XP is the line from the animal perpendicular to the transect.)
In this work, we shall consider the problem of estimating population density using only the right angle distances. Because estimates depending only on right angle distances are easy and economical to use, such estimates have become very popular over the past several years.

Before any estimation procedure based on right angle distances can be formulated, certain assumptions regarding the population of interest must be made. A set of assumptions used by several workers in the area is detailed in Section 2.2.1. One of the key assumptions in this set is that the probability of sighting an animal located at a right angle distance, y, from the transect can be represented by some nonincreasing function g(y), which satisfies the equality g(0) = 1. This function is simply a mathematical tool for dealing with the fact that animals located closer to the line transect will be seen more readily than animals located further away from the transect. An alternative method of dealing with this phenomenon is given by Anderson and Pospahala (1970). If g(y) is assumed to have some specific functional form determined by some unknown parameters, then the estimate is said to be parametric. On the other hand, if g(y) is left unspecified except for the requirements that it is nonincreasing and g(0) = 1, then the estimate is said to be nonparametric.

Seber (1973) has shown that any density estimate based on right angle distances will have the form

D̂_s = N / (2 L_0 ĉ),

where N is a random variable representing the number of animals seen in a line transect of length L_0 and ĉ is an estimate for c, a parameter which depends on g(y) through the relation

c = ∫_0^∞ g(y) dy.

By noting that the density is simply the number of animals present per unit of area, it is clear that c can be interpreted as one-half of the effective width of the strip actually covered by the observer as he moves along the transect. Further examination of D̂_s also points out that estimating the parameter c is the key to the estimation problem.

At this time, we would like to point out that the range for the right angle distance, y, is allowed to go from 0 to +∞, as seen in the integral on the right hand side of the equation for c. In practice, since we are considering only a finite area, A, there will most certainly be a maximum observation distance, W, perpendicular to the transect. However, if W is large enough so that the approximation

∫_0^W g(y) dy ≈ ∫_0^∞ g(y) dy (1.1)

is reasonable, then letting y range in the interval [0, +∞) will not cause any real problems. In practical terms, this means that the probability of observing an animal located beyond the boundary, W, should be essentially zero.
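The truncation condition (1.1) and the general form of D̂_s can be made concrete with a small numerical sketch. The detection function g(y) = exp(−λy), the value λ = 0.5, and the sighting counts below are illustrative assumptions, not values from this work.

```python
import math

# Assumed detection function g(y) = exp(-lam*y); its half-width is c = 1/lam,
# and the truncated integral in (1.1) is (1 - exp(-lam*W))/lam.
lam = 0.5
c_full = 1.0 / lam

for W in (2.0, 5.0, 10.0, 20.0):
    c_trunc = (1.0 - math.exp(-lam * W)) / lam
    print(W, c_trunc, c_full - c_trunc)   # the neglected tail shrinks toward zero

def density_estimate(n_seen, L0, c_hat):
    """Seber's general form: D-hat_s = N / (2 * L0 * c-hat)."""
    return n_seen / (2.0 * L0 * c_hat)

# Hypothetical survey: 40 animals seen on a transect of length 1000.
print(density_estimate(40, 1000.0, c_full))   # 0.01 animals per unit area
```

For W = 20 the neglected tail is e^{−10}/λ, already far below any practical sighting probability, which is the sense in which (1.1) "causes no real problems."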
In most real life situations, W can be chosen large enough so that the approximation given in (1.1) is valid. Thus, in the chapters which follow, we will implicitly assume that relation (1.1) holds for the density estimates that we develop.

Both parametric and nonparametric models have been used to derive an estimate for the parameter c, and, consequently, for the population density. In both cases, the estimate for c turns out to be a function of the observed right angle distances. In the parametric case, ĉ will simply be a function of the parameters that define the function chosen for g(y). Examples of parametric estimates are found in Gates et al. (1968), Sen et al. (1974) and Pollock (1978).

Estimation using the nonparametric model is more complicated. Burnham and Anderson (1976) have shown that estimating 1/c is equivalent to estimating f_Y(0), where f_Y(·) is the conditional probability density function for right angle distance given an animal is sighted. Thus, the problem of finding a nonparametric estimate for the population density reduces to the problem of estimating a density function at a given point. Unfortunately, this problem has not received much attention in the literature. Burnham and Anderson (1976) suggest four possible estimates for f_Y(0), but the sampling variances associated with these estimates have not been established.
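Since f_Y(y) = g(y)/c and g(0) = 1 under assumption A4, the relation f_Y(0) = 1/c can be checked numerically. The Gaussian-shaped g below is purely an assumed example (its integral over [0, ∞) is √π/2).

```python
import math

# Assumed detection function with g(0) = 1; here c = ∫_0^∞ exp(-y²) dy = √π/2.
g = lambda y: math.exp(-y * y)

# Trapezoid rule on [0, 10]; the tail beyond 10 is negligible for this g.
steps, upper = 100_000, 10.0
h = upper / steps
c = (0.5 * (g(0.0) + g(upper)) + sum(g(i * h) for i in range(1, steps))) * h

# Conditional density of a sighted animal's right angle distance: f_Y(y) = g(y)/c,
# so f_Y(0) = g(0)/c = 1/c, the quantity the nonparametric estimates target.
f0 = g(0.0) / c
print(c, f0, 1.0 / c)
```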
Crain et al. (1978) have also considered the problem of estimating f_Y(0). They derive an estimate using a Fourier series expansion to approximate the conditional probability density function f_Y(y). Although their procedure does not lead to a simple estimate, they do provide an approximation to its sampling variance.

The line transect method and the corresponding population density estimates so far described require the observer to travel a predetermined distance, L_0, along the transect. This method will be called the direct sampling method. An alternative to the direct method is the inverse sampling method, wherein sampling is terminated as soon as a specified number, N_0, of animals are sighted. Clearly, in the direct method, the number of animals seen is a random variable and the total length travelled is a fixed quantity, while in the inverse sampling method, the total length travelled is the random variable and the number of animals that must be seen is fixed. The main focus of this work will be to develop density estimation techniques that are based on the inverse sampling method. In addition, we will consider the density estimation problem when a combination of the inverse and direct sampling plans is used.

1.3 Summary of Results

In Chapter II we derive two estimates for the population density, D, using an inverse sampling scheme. The set of assumptions which justify the use of these estimates is similar to those used by Gates et al. (1968) and several others. The estimates have the form

D̂_I = N_0 / (2 L ĉ),

where N_0 is the number of animals that must be seen before sampling terminates, L is a random variable representing the length travelled on the transect and ĉ is as previously defined. Note the similarity of D̂_I to D̂_s given in Section 1.2. The only difference between the two estimates is that in D̂_I the random variables are L and ĉ, while in D̂_s they are N and ĉ. However, this difference gives the inverse sampling method a theoretical advantage over the direct sampling method. The random variables L and ĉ will be seen to be independent, while N and ĉ are not. Thus, the estimate D̂_I is the product of two independent random variables, a fact which not only allows us to obtain its expected value and variance easily, but also leads to a simple criterion for sample size determination.

Both a parametric and a nonparametric estimate for the animal population density are developed in Chapter II. In deriving the parametric estimate, the functional form assumed for g(y) is identical to the one used by Gates et al. (1968). Our parametric density estimate is shown to be unbiased and the exact variance of this estimate is also provided. In the nonparametric case we propose an estimate for f_Y(0) using the method developed by Loftsgaarden and Quesenberry (1965). We then use heuristic reasons to show that the corresponding density estimate is asymptotically unbiased, and derive a large sample approximation for its variance.

The inverse sampling method does have one drawback when there is little information available concerning the population to be studied, namely, there exists the possibility that an observer might have to cover a very long transect to sight N_0 animals. To overcome this problem, we develop a parametric density estimate in Chapter III that is based on a combination of the inverse and the direct sampling procedures. In the combined sampling scheme, sampling is terminated when either a prespecified number, N_0, of animals are sighted or when a prespecified length, L_0, has been travelled along the transect. Thus, in combined sampling both the length travelled and the number of animals seen will be random variables.

In deriving the density estimate based on the combined sampling method, we again use the functional form for g(y) proposed by Gates et al. (1968). This estimate is shown to be asymptotically unbiased. In addition, an approximate variance for this density estimate is provided.

The density estimates developed in Chapters II and III are based on the assumption that the sightings of animals will be independent events. Gates et al. (1968) showed that this assumption failed to hold for the animal population they were studying. In Chapter IV we relax this assumption, and develop an estimate based on inverse sampling that can be applied to clustered populations, populations in which the animals aggregate into small groups or "clusters." Since the estimation procedure developed will require the use of a high-speed computer, the last section of Chapter IV is devoted to a worked example to illustrate the computations that would be involved.
CHAPTER II
DENSITY ESTIMATION USING THE INVERSE SAMPLING PROCEDURE

2.1 Introduction

In this chapter we shall propose estimates for animal population density based on an inverse sampling procedure. Unlike the direct sampling method considered by Gates et al. (1968), the inverse sampling procedure specifies the number of animals that must be sighted before the sampling can be terminated. Thus, in the inverse case the number of animals sighted will be a fixed rather than a random quantity. A precise formulation of the inverse sampling method is as follows:
1. Place a line at random across the area, A, to be sampled.
2. Specify a fixed number, N_0, and sample along the line transect until N_0 animals are observed.
As one proceeds along the transect, certain measurements will be made. These will be denoted by y_1, y_2, ..., y_{N_0} and ℓ, where y_i is the right angle distance from the i-th animal observed to the transect and ℓ is the total distance travelled along the transect during the observation period. A visual depiction of these measurements is given in Figure 2.

Figure 2. Measurements recorded using inverse sampling.

2.2 A General Model Based on Right Angle Distances and Transect Length

The estimates for the density, D, that we will develop are based on the right angle distances, y_1, y_2, ..., y_{N_0}, and the total distance, ℓ, travelled along the transect. Two possible approaches to the estimation of D merit consideration. First, recall that density is defined as the number of animals present per unit of area, or equivalently the rate at which animals are distributed over some specific area. Therefore, we can write

D = N / A,

where A is the area of interest and N is the total number of animals present in A. In the direct sampling approach the estimation of D is most often accomplished by first estimating N and then dividing by A. Seber (1973) shows that any estimate of N based on direct sampling has the form

N̂ = N A / (2 L_0 ĉ),

where N is a random variable denoting the number of animals seen, L_0 is the length of the transect and ĉ is an estimate of c, a parameter which depends on the probability of sighting an animal given its right angle distance from the transect. Note that in Seber's estimate, N is random and L_0 is fixed. It follows then, that Seber's estimate for D does not depend explicitly on A and has the form

D̂_s = N / (2 L_0 ĉ).

Therefore, the estimate of D is independent of the actual size of A, a property that any reasonable estimate of D should possess. As an alternative, D itself can be regarded as the basic parameter of interest and estimates for D can be derived directly. This is the approach taken by Burnham and Anderson (1976) and the one that we will follow in developing our estimates.

2.2.1 Assumptions

The form of any estimate of D, the animal population density, will depend upon the type of assumptions we can
make regarding the distribution of the animals to be censused and the nature of the observations that will be made. The assumptions our estimates will be based on are as follows:
A1. The animals are randomly distributed with rate or density D over the area of interest A, i.e., the probability of a given animal being in a particular region of area, δA, is δA/A.
A2. The animals are independently distributed over A, i.e., given two disjoint regions of area, δA_1 and δA_2, P(n_1 animals are in δA_1 and n_2 animals are in δA_2) = P(n_1 animals are in δA_1) P(n_2 animals are in δA_2).
A3. The probability of sighting an animal depends only on its distance from the transect. In addition, there exists a function g(y) giving the conditional probability of observing an animal given its right angle distance, y, from the transect. In probability notation, g(y) = P(observing an animal | y).
A4. g(0) = 1, i.e., animals on the line are seen with probability one.
A5. Animals are fixed, i.e., there is no confusion over animals moving during sampling and none are counted twice.

2.2.2 Derivation of the Likelihood Function

We will use the maximum likelihood procedure to obtain an estimate for D. The joint density function we are interested in is f_{Y,L}(y, ℓ; N_0), where Y = (Y_1, Y_2, ..., Y_{N_0}) is the vector of random variables representing the right angle distances, L is the random variable representing the total length travelled, and N_0 is the specified number of animals to be seen before sampling terminates. Since the dependence of the joint density on N_0 is implicit throughout the rest of this chapter, it will be dropped from our notation for convenience. Thus, from now on we will denote the density as f_{Y,L}(y, ℓ), and all other expressions depending on N_0 in this manner will be handled accordingly. The following two theorems will be very useful in the derivation of the likelihood function.

Theorem 1: Let N(ℓ) denote the number of animals sighted in the interval (0, ℓ] along the transect. Then, N(ℓ) is a Poisson process, and for some θ > 0,

P{N(ℓ) = n} = (θℓ)^n e^{−θℓ} / n!, n = 0, 1, 2, ... .

Note that the quantity θℓ equals the expected number of animals sighted per segment of length ℓ.

Proof: In order to show that N(ℓ) is a Poisson process, we will show that the assumptions in Section 2.2.1 imply the postulates necessary for a Poisson process given in Lindgren (1968, p. 162). First, consider two disjoint intervals, ℓ_1 and ℓ_2, along the transect and the corresponding areas, A(ℓ_1) and A(ℓ_2), enclosed by lines perpendicular to the transect as shown in Figure 3.
Figure 3. Two disjoint areas along the transect.

Now let N_1 and N_2 be random variables representing the total number of animals that occupy A(ℓ_1) and A(ℓ_2), respectively. By definition, N(ℓ_1) and N(ℓ_2) are the number of animals sighted in A(ℓ_1) and A(ℓ_2), respectively. We know from assumption A2 that N_1 and N_2 are independent, and from assumption A3 that sighting an animal depends only on its distance from the transect. Thus, N(ℓ_1), which depends solely on N_1 and the distances to the N_1 animals from the transect, is independent of N(ℓ_2), i.e., the numbers of sightings that occur in two disjoint intervals along the transect are independent events. Next we will show that for every ℓ > m ≥ 0 and any h > 0, N(ℓ) − N(m) and N(ℓ+h) − N(m+h) are identically distributed. First, note that the effective area sampled in seeing N(ℓ) − N(m) animals and N(ℓ+h) − N(m+h) animals is equal to A(ℓ−m) as seen in Figure 4.
Figure 4. Effective area sampled in seeing N(ℓ) − N(m) animals and N(ℓ+h) − N(m+h) animals.

Therefore, by assumptions A1, A2, and A3, and since the transect is dropped at random, it follows that

P{N(ℓ) − N(m) = j} = P{N(ℓ+h) − N(m+h) = j}, j = 0, 1, 2, ... .

Next we must show that for every ℓ > 0, and some θ > 0,

P{N(ℓ) = 1} = θℓ + o(ℓ), as ℓ → 0,

where o(ℓ) is a function such that

lim_{ℓ→0} o(ℓ)/ℓ = 0.

Again let A(ℓ) be the area defined by ℓ on the transect. Now define B_i to be the event {N(ℓ) = i} and E_j to be the event that there are exactly j animals in area A(ℓ). Then it follows that

P(B_1) = Σ_{j=1}^∞ P(B_1 E_j) = Σ_{j=1}^∞ P(B_1 | E_j) P(E_j).

Under assumptions A1 and A2, Pielou (1969, p. 81) has shown that

P(E_j) = e^{−DA(ℓ)} [DA(ℓ)]^j / j!, j = 0, 1, 2, ... .

Also, under assumptions A1, A2 and A3, Seber (1973, Eq. (2.6)) has shown that

P(B_1 | E_1) = 2cℓ / A(ℓ),

where

c = ∫_0^∞ g(y) dy. (2.1)

Therefore, we can write

P(B_1) = 2cDℓ e^{−DA(ℓ)} + Σ_{j=2}^∞ P(B_1 | E_j) P(E_j),

and if we show

Σ_{j=2}^∞ P(B_1 | E_j) P(E_j) = o(ℓ),

the proof will be complete. Note that

Σ_{j=2}^∞ P(B_1 | E_j) P(E_j) ≤ Σ_{j=2}^∞ e^{−DA(ℓ)} [DA(ℓ)]^j / j!
= DA(ℓ) Σ_{j=2}^∞ e^{−DA(ℓ)} [DA(ℓ)]^{j−1} / j!
≤ DA(ℓ) Σ_{j=2}^∞ e^{−DA(ℓ)} [DA(ℓ)]^{j−1} / (j−1)!
= DA(ℓ) [1 − e^{−DA(ℓ)}].

For any finite area A, A(ℓ) is O(ℓ), that is,

lim_{ℓ→0} A(ℓ)/ℓ ≤ K, for some K > 0.

Therefore, as ℓ → 0,

Σ_{j=2}^∞ P(B_1 | E_j) P(E_j) = o(ℓ),

and, upon writing

θ = 2cD, (2.2)

we get, as ℓ → 0,

P(B_1) = θℓ + o(ℓ).

Finally, we need to show that for every ℓ > 0,

Σ_{n>1} P{N(ℓ) = n} = o(ℓ), as ℓ → 0.

Note that for all n > 1, we can write

P(B_n) = Σ_{j=n}^∞ P(B_n E_j) = Σ_{j=n}^∞ P(B_n | E_j) P(E_j).

Again, by using the fact that A(ℓ) is O(ℓ), it is easy to show that P(B_n) = o(ℓ), as ℓ → 0, and N(ℓ) satisfies the four conditions necessary for a Poisson process.

Before proceeding to the second theorem, we need to define the following random variables. Let T_i denote the random variable corresponding to the distance travelled on the transect between sightings of the (i−1)-th and i-th animals, i = 1, 2, ..., N_0. Then the total distance travelled is given by

L = Σ_{i=1}^{N_0} T_i.

The following theorem establishes the independence of Y and T_1, T_2, ..., T_{N_0} for the case N_0 = 2, and this fact enables us to derive the joint density function, f_{Y,L}(y, ℓ).

Theorem 2: The random variables T_1, T_2, Y_1 and Y_2 are mutually independent.

Proof: In order to establish the independence of T_1, T_2, Y_1 and Y_2 we will derive the joint density f_{T_1,T_2,Y_1,Y_2}(t_1, t_2, y_1, y_2) and show that it can be factored into four functions, each depending on only one of the random variables of interest. Let y_1, y_2, t_1, t_2, h_1, h_2, g_1 and g_2 be non-negative real numbers such that h_1 < t_2.
Figure 5. Areas defined by y_1, y_2, t_1, t_2, g_1, g_2, h_1, and h_2.

Now let

P(h_1, g_1, h_2, g_2) = P(t_1 < T_1 ≤ t_1 + h_1, y_1 < Y_1 ≤ y_1 + g_1, t_2 < T_2 ≤ t_2 + h_2, y_2 < Y_2 ≤ y_2 + g_2).

Then

f_{T_1,T_2,Y_1,Y_2}(t_1, t_2, y_1, y_2) = lim P(h_1, g_1, h_2, g_2) / (h_1 g_1 h_2 g_2),

where the limit is taken as h_i → 0 and g_i → 0, i = 1, 2, provided the limit exists. Now notice that the event whose probability we wish to find, namely

{t_1 < T_1 ≤ t_1 + h_1, y_1 < Y_1 ≤ y_1 + g_1, t_2 < T_2 ≤ t_2 + h_2, y_2 < Y_2 ≤ y_2 + g_2},

can be written as the intersection of the following events:
S_1, the event {N(t_1) = 0};
S_2, the event {N(t_1 + h_1) − N(t_1) = 1} and {y_1 < Y ≤ y_1 + g_1};
S_3, the event {N(t_1 + t_2) − N(t_1 + h_1) = 0};
S_4, the event {N(t_1 + t_2 + h_2) − N(t_1 + t_2) = 1} and {y_2 < Y ≤ y_2 + g_2}.

By Theorem 1,

P(S_1) = e^{−θ t_1} and P(S_3) = e^{−θ(t_2 − h_1)}.

However, P(S_2) and P(S_4) are not so easily obtained. We will only show how to find P(S_2), since P(S_4) is found in a similar fashion. First, define S_{2j} to be the event that there are exactly j animals in area I. Then

P(S_2) = Σ_{j=1}^∞ P(S_2 S_{2j}) = Σ_{j=1}^∞ P(S_2 | S_{2j}) P(S_{2j}).

By assumptions A1 and A2, the number of animals located in area I will be distributed as a Poisson random variable with parameter 2Dg_1h_1, where D is the density of the animals (see Pielou, 1969, p. 81). Note, the factor of 2 comes in since area I can be found on both sides of the transect. Therefore,

P(S_{2j}) = e^{−2Dg_1h_1} (2Dg_1h_1)^j / j!, j = 0, 1, 2, ... .

By assumption A3,

P(S_2 | S_{21}) = g(y_1′), for some y_1′ ∈ (y_1, y_1 + g_1].
In the same manner, we can show that the independence established in Theorem 2 will hold for any finite number of sightings, N_0. In this case, if T = (T_1, T_2, ..., T_{N_0}) and Y = (Y_1, Y_2, ..., Y_{N_0}), then (2.3) becomes

f_{T,Y}(t, y) = 2^{N_0} D^{N_0} e^{−θ Σ_{i=1}^{N_0} t_i} Π_{i=1}^{N_0} g(y_i).

Upon using equation (2.2) in f_{T,Y}(t, y), we get

f_{T,Y}(t, y) = θ^{N_0} e^{−θ Σ_{i=1}^{N_0} t_i} c^{−N_0} Π_{i=1}^{N_0} g(y_i).

Thus, the marginal distributions for Y_i and T_i are

f_{Y_i}(y_i) = g(y_i)/c, y_i > 0,

and

f_{T_i}(t_i) = θ e^{−θ t_i}, t_i > 0.

Therefore, T_1, T_2, ..., T_{N_0} are independent, identically distributed (iid) as Exponential random variables with parameter θ, and

L = Σ_{i=1}^{N_0} T_i

has a Gamma distribution with parameters N_0 and θ, i.e.,

f_L(ℓ) = θ^{N_0} ℓ^{N_0 − 1} e^{−θℓ} / Γ(N_0), ℓ > 0.

Furthermore, L is independent of Y.
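The distributional claim can be checked by simulation: the gaps T_i are iid Exponential(θ), so L should have the Gamma(N_0, θ) mean N_0/θ and variance N_0/θ². The values of θ and N_0 below are arbitrary illustrations, not quantities from the text.

```python
import random

random.seed(7)

theta, N0, reps = 0.8, 25, 4000   # illustrative sighting rate and stopping count

def transect_length():
    """L = sum of N0 iid Exponential(theta) gaps between successive sightings."""
    return sum(random.expovariate(theta) for _ in range(N0))

lengths = [transect_length() for _ in range(reps)]
mean = sum(lengths) / reps
var = sum((x - mean) ** 2 for x in lengths) / reps

print(mean, N0 / theta)        # Gamma(N0, theta) mean: 31.25
print(var, N0 / theta ** 2)    # Gamma(N0, theta) variance: 39.0625
```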
The likelihood function for the estimation of θ and c can now be obtained by taking the product of f_L(ℓ) and f_Y(y), i.e.,

L(θ, c; y, ℓ) = [θ^{N_0} ℓ^{N_0 − 1} e^{−θℓ} / Γ(N_0)] c^{−N_0} Π_{i=1}^{N_0} g(y_i). (2.4)

We will now outline how one can estimate D, the animal population density, from the likelihood function given in (2.4). As noted earlier, D is related to θ and c by equation (2.2), i.e.,

D = θ / (2c).

Thus, the maximum likelihood estimate for D would be

D̂ = θ̂ / (2ĉ),

where θ̂ and ĉ are maximum likelihood estimates of θ and c, respectively, obtained from (2.4). Note that the estimate D̂ is the ratio of two mutually independent random variables, one depending on L alone and the other depending on Y alone. This property will be found to be very useful when evaluating the moments of D̂. We have now set the framework necessary for deriving an estimate of D. In the next section we shall obtain an estimate for D assuming that g(y) has a particular parametric form.
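The value of the independence noted above comes from the elementary identities E(XY) = E(X)E(Y) and Var(XY) = E(X²)E(Y²) − [E(X)]²[E(Y)]² for independent X and Y; these are what later yield the moments of D̂ from the marginal moments of the two factors alone. A toy check of the variance identity, with two invented discrete distributions:

```python
from itertools import product

# Two small independent discrete random variables, as (value, probability) pairs.
X = [(1.0, 0.5), (2.0, 0.5)]
Y = [(3.0, 0.5), (5.0, 0.5)]

def moment(dist, k):
    return sum(p * v ** k for v, p in dist)

# Var(XY) by direct enumeration of the joint distribution (independence means
# joint probabilities are products of the marginal probabilities).
exy  = sum(px * py * x * y        for (x, px), (y, py) in product(X, Y))
exy2 = sum(px * py * (x * y) ** 2 for (x, px), (y, py) in product(X, Y))
var_direct = exy2 - exy ** 2

# Var(XY) from marginal moments only.
var_formula = moment(X, 2) * moment(Y, 2) - moment(X, 1) ** 2 * moment(Y, 1) ** 2

print(var_direct, var_formula)   # both 6.5
```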
28 2 . 3 A Parametric Density Estimate Any estimate for D that is derived after assuming an explicit function for g(y) will be called a parametric estimate. Gates et al . (1968), using direct sampling, derived an estimate for D assuming g(y) = e" ^. Using this same function for g(y), we will derive the corresponding estimate based on inverse sampling. 2.3.1 Maximum Likelihood Estimate for D To estimate D we need to estimate both 6 and c from the likelihood function (2.4). In this case g(y) = e'^^", y>0, A>0 so that Substituting for c in (2.2) yields D = ^ . (2.5) Also, by substituting for c in (2. A), the likelihood function becomes No L(0,A;^,f,) = A^°e "--^ ^ l^^j^ . 1^0, y.>0. (2.6) The joint maximum likelihood estimates for and A can now be easily obtained. The natural logarithm of the likelihood function is
PAGE 37
\ln L(\theta,\lambda;\,y,\ell) = N_0\ln\lambda - \lambda\sum_{i=1}^{N_0}y_i + N_0\ln\theta + (N_0-1)\ln\ell - \theta\ell - \ln\Gamma(N_0).

Taking the partial derivatives with respect to \theta and \lambda yields

\frac{\partial\ln L(\theta,\lambda;\,y,\ell)}{\partial\theta} = \frac{N_0}{\theta} - \ell,
\qquad
\frac{\partial\ln L(\theta,\lambda;\,y,\ell)}{\partial\lambda} = \frac{N_0}{\lambda} - \sum_{i=1}^{N_0}y_i.

Setting these equal to 0 yields

\hat\theta = \frac{N_0}{\ell} \quad\text{and}\quad \hat\lambda = \frac{N_0}{\sum_{i=1}^{N_0}y_i}.

Substituting these estimates for \theta and \lambda in (2.5), the maximum likelihood estimate for D is seen to be

\hat{D} = \frac{\hat\theta\hat\lambda}{2} = \frac{N_0^2}{2\ell\sum_{i=1}^{N_0}y_i}.
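As a check on the algebra, the estimate \hat{D} = \hat\theta\hat\lambda/2 = N_0^2/(2\ell\sum y_i) is a one-line computation. The sketch below is ours (the function name is not from the text), assuming the data consist of the total distance travelled and the list of right angle distances:

```python
def mle_density(total_length, distances):
    """MLE of D for g(y) = exp(-lambda*y): theta-hat = N0/l, lambda-hat = N0/sum(y),
    so D-hat = theta-hat * lambda-hat / 2 = N0^2 / (2 * l * sum(y))."""
    n0 = len(distances)
    return n0 * n0 / (2.0 * total_length * sum(distances))


# N0 = 4 sightings, l = 10 distance units, right angle distances summing to 4:
print(mle_density(10.0, [1.0, 1.0, 1.0, 1.0]))  # 0.2 animals per unit area
```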
2.3.2 Unbiased Estimate for D

The expected value of the estimate \hat{D}, developed in Section 2.3.1, is

E(\hat{D}) = E\left(\frac{\hat\theta\hat\lambda}{2}\right) = \frac{1}{2}E(\hat\theta)E(\hat\lambda),

since \hat\theta and \hat\lambda are independent. Using the fact that L has a Gamma distribution with parameters N_0 and \theta, we obtain

E(\hat\theta) = E\left(\frac{N_0}{L}\right) = \frac{N_0\theta}{N_0-1}.

To derive an expression for E(\hat\lambda), first recall that Y_1,\ldots,Y_{N_0} are iid with the common density

f_Y(y) = \frac{g(y)}{c} = \lambda e^{-\lambda y}, \quad y>0.

Therefore, \sum_{i=1}^{N_0}Y_i is distributed as a Gamma random variable with parameters N_0 and \lambda, and

E(\hat\lambda) = E\left(\frac{N_0}{\sum_{i=1}^{N_0}Y_i}\right) = \frac{N_0\lambda}{N_0-1}.

Independence of \hat\theta and \hat\lambda now yields

E(\hat{D}) = \frac{1}{2}E(\hat\theta)E(\hat\lambda) = \left(\frac{N_0}{N_0-1}\right)^2 D.

Thus \hat{D} is biased, and an unbiased estimate is obtained by multiplying \hat{D} by [(N_0-1)/N_0]^2:

\hat{D}_u = \frac{(N_0-1)^2}{2L\sum_{i=1}^{N_0}Y_i}.   (2.7)
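The effect of the bias correction is easy to verify by simulation: draw L from a Gamma(N_0, \theta) distribution, draw the Y_i independently from an exponential(\lambda) distribution, and average \hat{D}_u = (N_0-1)^2/(2L\sum Y_i) over many replicates; the average should be close to D = \theta\lambda/2. A hedged sketch (the parameter values are illustrative, not from the text):

```python
import random


def d_hat_u(total_length, distances):
    """Bias-corrected inverse-sampling estimate (N0-1)^2 / (2 * L * sum(Y))."""
    n0 = len(distances)
    return (n0 - 1) ** 2 / (2.0 * total_length * sum(distances))


def mean_d_hat_u(theta=2.0, lam=4.0, n0=20, reps=20000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        length = rng.gammavariate(n0, 1.0 / theta)      # L ~ Gamma(N0, theta)
        ys = [rng.expovariate(lam) for _ in range(n0)]  # Y_i iid exponential(lambda)
        total += d_hat_u(length, ys)
    return total / reps


# True density D = theta * lambda / 2 = 4; the simulated mean should be close to 4.
print(mean_d_hat_u())
```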
2.3.3 Variance of \hat{D}_u

Due to the independence of L and \sum_{i=1}^{N_0}Y_i, the variance of \hat{D}_u can be derived directly. We have

Var(\hat{D}_u) = Var\left(\frac{(N_0-1)^2}{2L\sum_{i=1}^{N_0}Y_i}\right) = \frac{(N_0-1)^4}{4}\,Var\left(\frac{1}{L\sum_{i=1}^{N_0}Y_i}\right)
= \frac{(N_0-1)^4}{4}\left\{E\left[\frac{1}{\left(L\sum_{i=1}^{N_0}Y_i\right)^2}\right] - \left[E\left(\frac{1}{L\sum_{i=1}^{N_0}Y_i}\right)\right]^2\right\}.

Since L and \sum_{i=1}^{N_0}Y_i are independent, it follows that

Var(\hat{D}_u) = \frac{(N_0-1)^4}{4}\left\{E\left(\frac{1}{L^2}\right)E\left[\frac{1}{(\sum_{i=1}^{N_0}Y_i)^2}\right] - \left[E\left(\frac{1}{L}\right)E\left(\frac{1}{\sum_{i=1}^{N_0}Y_i}\right)\right]^2\right\}.
Since L has a Gamma distribution with parameters N_0 and \theta,

E\left(\frac{1}{L}\right) = \frac{\theta}{N_0-1} \quad\text{and}\quad E\left(\frac{1}{L^2}\right) = \frac{\theta^2}{(N_0-1)(N_0-2)}, \quad N_0>2,

with the analogous expressions, with \lambda in place of \theta, holding for \sum_{i=1}^{N_0}Y_i; in particular,

E\left[\frac{1}{(\sum_{i=1}^{N_0}Y_i)^2}\right] = \frac{\lambda^2}{(N_0-1)(N_0-2)}.   (2.12)

Substituting these moments into the expression for Var(\hat{D}_u) and simplifying yields

Var(\hat{D}_u) = D^2\left[\frac{(N_0-1)^2}{(N_0-2)^2} - 1\right] = D^2\,\frac{2N_0-3}{(N_0-2)^2}, \quad N_0>2.   (2.13)

A useful measure of the precision of an estimate is its coefficient of variation, defined for an estimate \hat{D} as
CV = \frac{\sigma_{\hat{D}}}{E(\hat{D})},

where \sigma_{\hat{D}} and E(\hat{D}) denote, respectively, the standard deviation and the expected value of the estimate \hat{D}. As one can see immediately, small values of CV are desirable, since this indicates that the estimate has a small standard deviation relative to its expected value.

With the inverse sampling method, the value of N_0 needed to guarantee a preset value, C, for the coefficient of variation of \hat{D}_u can be calculated easily. Using (2.7) and (2.13), we see that, for N_0>2,

CV(\hat{D}_u) = \frac{(2N_0-3)^{1/2}}{N_0-2}.

Then, setting C = CV(\hat{D}_u), it is easily shown that N_0 is a root of the quadratic equation

C^2N_0^2 - (4C^2+2)N_0 + 4C^2+3 = 0.

Solving for N_0 yields the two roots

N_0 = \frac{2C^2+1 \pm (C^2+1)^{1/2}}{C^2}.

Since the variance of \hat{D}_u exists only for N_0>2, the required sample size is

N_0 = \left\lceil\frac{2C^2+1+(C^2+1)^{1/2}}{C^2}\right\rceil.

For example, if C=.25, then N_0 = 35. Table 1 gives values of N_0 corresponding to coefficients of variation ranging from .1 to .5.
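The larger root above, rounded up to the next integer, reproduces Table 1 exactly; a sketch (the function name is ours):

```python
import math


def inverse_sample_size(c):
    """Smallest N0 with CV(D_u-hat) <= c: the larger root of
    c^2*N0^2 - (4c^2+2)*N0 + (4c^2+3) = 0, rounded up."""
    root = (2.0 * c * c + 1.0 + math.sqrt(c * c + 1.0)) / (c * c)
    return math.ceil(root)


for c in (0.5, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1):
    print(c, inverse_sample_size(c))   # matches Table 1: 11, 15, 25, 35, 53, 92, 203
```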
Table 1. Number of animals, N_0, that must be sighted to guarantee that the estimate, \hat{D}_u, has coefficient of variation, CV(\hat{D}_u).

CV(\hat{D}_u)    N_0
.50              11
.40              15
.30              25
.25              35
.20              53
.15              92
.10             203

2.4 Nonparametric Density Estimate

In this section we will consider a nonparametric estimate for the population density, D, using inverse sampling. In contrast to the parametric approach used in Section 2.3, the nonparametric approach leaves the function g(y), which represents the probability of observing an animal given its right angle distance, unspecified. In Section 2.2.2 we showed that an estimate for D is given by

\hat{D} = \frac{\hat\theta}{2\hat{c}},

where \hat\theta and \hat{c} are the estimates for \theta, the expected number of sightings per unit length of the transect, and c, defined as

c = \int_0^\infty g(y)\,dy.
If g(y) is completely specified, except perhaps for some parameters, then the problem of estimating D reduces to the problem of estimating \theta and the parameters in g(y). In Section 2.3 we considered the specific case g(y) = e^{-\lambda y}. A drawback to this approach, where we specify a functional form for g(y), is that the function chosen must take into account the inherent detection difficulties that are present when a particular animal species is being sampled. If one examines the various forms that have been suggested for g(y), one quickly becomes aware of the problem of finding a form that is flexible enough to accommodate the many possibilities which exist. Some of the functions that have been proposed for g(y) are presented in Table 2. As seen in the table, the suggestions for g(y) represent a number of different shapes, in an effort to reflect the nature of the animal being sampled and the type of ground cover being searched.

Because of the problems that can arise in choosing a function for g(y), Burnham and Anderson (1976) considered a nonparametric approach as a means of avoiding the need for the specification of g(y). Leaving g(y) unspecified will allow the estimation procedure to depend on the observations that are actually made, not on any particular model. Thus, a nonparametric model might provide a more robust estimation method, that is, an estimation method that could be applied to a much wider class of animal species.
Table 2. Forms proposed for the function, g(y).

Function                                   Author
g(y) = e^{-\lambda y}, \lambda>0           Gates et al. (1968)
[remaining entries illegible in the scan]  Eberhardt (1968)
Now, if g(y) is left unspecified, then an estimate for 1/c may be obtained along the same lines Burnham and Anderson (1976) used in the case of direct sampling. By assumption A4, g(0) = 1. Hence, 1/c equals the value of f_Y(\cdot) evaluated at y=0, where f_Y(\cdot) is the probability density function for the right angle distance, Y, given an animal is seen. The problem of finding a nonparametric estimate for 1/c, therefore, reduces to the problem of finding an estimate, \hat{f}_Y(0), for f_Y(0). An estimate for D will then be given by

\hat{D} = \frac{\hat\theta\,\hat{f}_Y(0)}{2},   (2.14)

where \hat\theta may be taken as the maximum likelihood estimate derived in Section 2.3.1, corrected for bias. That is,

\hat\theta = \frac{N_0-1}{L},

where we have replaced N_0 by N_0-1 to remove the bias.

2.4.2 An Estimate for f_Y(0)

Burnham and Anderson (1976) suggested four possible methods for estimating f_Y(0), but we are not aware of any work which investigates the theoretical properties of any of these estimates. Loftsgaarden and Quesenberry (1965) considered a density function estimate based on the observation that

f_Y(x) = \lim_{h\to 0}\frac{F_Y(x+h)-F_Y(x-h)}{2h},

where F_Y(\cdot) is the cumulative distribution function. For the purpose of estimating f_Y(0), their estimate takes the form
\hat{f}_Y(0) = \left\{\sqrt{N_0}\;Y_{([\sqrt{N_0}+1])}\right\}^{-1},   (2.15)

where [\sqrt{N_0}+1] is the value of \sqrt{N_0}+1 rounded off to the nearest integer and Y_{(j)} is the j-th order statistic of the sample y_1, y_2, \ldots, y_{N_0}. Loftsgaarden and Quesenberry (1965) showed that \hat{f}_Y(0) as given in (2.15) is a consistent estimate, provided f_Y(\cdot) is a positive and continuous probability density function.

One nice property of \hat{f}_Y(0) is that it can be easily calculated from the data. However, evaluation of the moments of this estimate does present some problems. In fact, the mean and the variance may not even exist in some cases. But, whenever [\sqrt{N_0}+1] \ge 3, i.e., whenever N_0 \ge 4, the variance of \hat{f}_Y(0) is finite, as shown in the following theorem.

Theorem 3. Let Y_1, Y_2, \ldots, Y_n be a set of independent, identically distributed random variables, representing the right angle distances, with continuous probability density function (p.d.f.)

f_Y(y) = \frac{g(y)}{c}, \quad y\ge 0.

Also, let Y_{(r)} be the r-th order statistic. Then

E\left(Y_{(r)}^{-2}\right) < +\infty

for every integer r such that 3 \le r \le n.
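A sketch of the order-statistic estimate (2.15); the function name is ours, and the rounding follows the convention stated above:

```python
import math


def f0_hat(distances):
    """Estimate f_Y(0) by 1 / (sqrt(N) * Y_(r)), where r is sqrt(N)+1
    rounded to the nearest integer and Y_(r) is the r-th order statistic."""
    n = len(distances)
    r = int(math.sqrt(n) + 1.0 + 0.5)  # round sqrt(N)+1 to the nearest integer
    y_r = sorted(distances)[r - 1]     # r-th smallest right angle distance
    return 1.0 / (math.sqrt(n) * y_r)


# N = 16 gives r = 5; with distances 1, 2, ..., 16 the 5th order statistic is 5:
print(f0_hat([float(k) for k in range(1, 17)]))  # 1/(4*5) = 0.05
```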
Proof: The density function for Y_{(r)} is

h_r(y) = n\binom{n-1}{r-1}F_Y^{\,r-1}(y)\left[1-F_Y(y)\right]^{n-r}f_Y(y).

Therefore,

E\left(Y_{(r)}^{-2}\right) = n\binom{n-1}{r-1}\int_0^\infty y^{-2}F_Y^{\,r-1}(y)\left[1-F_Y(y)\right]^{n-r}f_Y(y)\,dy.

Since f_Y(\cdot) is continuous and bounded near 0, F_Y(y) \le My for some constant M and all y in a neighborhood of 0, so that y^{-2}F_Y^{\,r-1}(y) \le M^{r-1}y^{r-3} is bounded near 0 whenever r \ge 3. The integrand is therefore integrable near 0, and the integral over the remainder of the range is clearly finite, which completes the proof.
2.4.3 Approximations for the Mean and Variance of \hat{f}_Y(0)

Let F(\cdot) and F^{-1}(\cdot) denote the cumulative distribution function and its inverse for the random variable Y, the right angle distance. Also, let

r = [\sqrt{N}+1], \qquad U_r = F(Y_{(r)}), \qquad \phi(u) = \left\{\sqrt{N}\,F^{-1}(u)\right\}^{-1},

so that \hat{f}_Y(0) = \phi(U_r). Since U_r is the r-th order statistic of a sample of size N from the uniform distribution on (0,1), we have

E(U_r) = p_r = \frac{r}{N+1} \quad\text{and}\quad Var(U_r) = \frac{p_r(1-p_r)}{N+2}.

Expanding \phi(U_r) in a Taylor series about p_r and retaining the leading terms gives the approximations

E\{\phi(U_r)\} \approx \phi(p_r) \quad\text{and}\quad Var\{\phi(U_r)\} \approx \left[\frac{d\phi(u)}{du}\bigg|_{u=p_r}\right]^2 Var(U_r).   (2.17)
Taking the limit as N tends to infinity, and noting that F^{-1}(0) = 0 and u = F(y), yields

\lim_{N\to\infty}E\{\hat{f}_Y(0)\} = \lim_{N\to\infty}\phi(p_r) = \left\{\frac{dF^{-1}(u)}{du}\bigg|_{u=0}\right\}^{-1} = \frac{dF(y)}{dy}\bigg|_{y=0} = f_Y(0).   (2.18)

Thus, for large N, \hat{f}_Y(0) is approximately unbiased. An approximation for the variance of \hat{f}_Y(0) is found in a similar fashion. Using (2.17) we get

Var\{\phi(U_r)\} \approx \left[\frac{d\phi(u)}{du}\bigg|_{u=p_r}\right]^2 Var(U_r).

Evaluating the derivative yields

\frac{d\phi(u)}{du}\bigg|_{u=p_r} = -\left\{\sqrt{N}\,\left[F^{-1}(p_r)\right]^2 f_Y\!\left(F^{-1}(p_r)\right)\right\}^{-1},

so that

Var\{\phi(U_r)\} \approx \frac{p_r(1-p_r)}{N\left[F^{-1}(p_r)\right]^4 f_Y^2\!\left(F^{-1}(p_r)\right)(N+2)}.
Approximating F^{-1}(p_r) by p_r/f_Y(0) and substituting p_r = (\sqrt{N}+1)/(N+1), the preceding expression reduces to

Var\{\hat{f}_Y(0)\} \approx f_Y^2(0)\,\frac{(N-\sqrt{N})(N+1)^2}{(\sqrt{N}+1)^3 N(N+2)}.

Therefore, as N\to\infty we have

\lim_{N\to\infty}\sqrt{N}\,Var\{\hat{f}_Y(0)\} = f_Y^2(0),

so that an approximation for the variance, when N is large, is given by

Var\{\hat{f}_Y(0)\} \approx \frac{f_Y^2(0)}{\sqrt{N}}.   (2.19)

As stated earlier, the expressions obtained for the expected value and variance of \hat{f}_Y(0) are only approximations. Their adequacy for practical purposes may be evaluated by a Monte Carlo study involving various specific forms for the p.d.f., f_Y(\cdot). In the next section we will look at the results of just this kind of simulation study.

2.4.4 A Monte Carlo Study

A Monte Carlo study was used to examine the approximations for E\{\hat{f}_Y(0)\} and Var\{\hat{f}_Y(0)\} presented in Section 2.4.3. Three possible shapes for f_Y(\cdot) were used in the study. Since the shape of f_Y(\cdot) depends solely on the choice of g(y), the functions

g_1(y) = e^{-10y}, \quad y>0,

g_2(y) = 1-y, \quad 0<y<1,
and

g_3(y) = 1-y^2, \quad 0<y<1,

were used for g(y). For each of these choices, and for each of the sample sizes considered, 2000 samples of right angle distances were generated from the corresponding density f_Y(\cdot), and the estimate \hat{f}_Y(0) was computed for each sample. From these 2000 estimates we computed the empirical mean, \mu_e, the percent relative bias, B_e, and the empirical
standard deviation, \sigma_e, of \hat{f}_Y(0) given in equation (2.15), as follows. Let \hat{f}_{iY}(0) denote the estimate from the i-th sample, i = 1,2,\ldots,2000. Then

\mu_e = \frac{1}{2000}\sum_{i=1}^{2000}\hat{f}_{iY}(0),
\qquad
B_e = 100\,\frac{\mu_e - f_Y(0)}{f_Y(0)},

and

\sigma_e = \left\{\frac{1}{2000}\sum_{i=1}^{2000}\left(\hat{f}_{iY}(0)-\mu_e\right)^2\right\}^{1/2}.

All of the necessary computing was performed under release 76.6B of SAS (see Barr et al., 1976) at the Northeast Regional Data Center located at the University of Florida. The results of the study, along with the approximate standard deviations,

\sigma_T = \frac{f_Y(0)}{N^{1/4}},

are presented in Tables 3, 4, and 5. As can be seen from the tables, the estimate of f_Y(0) has a negative bias for most samples, generally of a magnitude less than 10% of the true value. The ratio \sigma_e/\sigma_T is also within 10% of one for almost all samples considered. This is even true for the smaller sample sizes, n \le 45; moreover, for the smaller sample sizes, the ratio was for the most part greater than one. Based on the results of this simulation, we feel that, in practice, the approximations obtained for the expected value and variance of \hat{f}_Y(0) would perform adequately.
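A scaled-down version of the study is easy to rerun today. The sketch below uses g_1(y) = e^{-10y} (so f_Y(0) = 10) and 2000 samples of size 100, and compares the empirical mean and standard deviation of \hat{f}_Y(0) with f_Y(0) and \sigma_T = f_Y(0)/N^{1/4}; the sample size and seed are our choices, not values from the original study:

```python
import math
import random


def simulate_f0(n=100, reps=2000, rate=10.0, seed=7):
    """Monte Carlo mean and standard deviation of the estimate (2.15)
    when f_Y(y) = rate * exp(-rate * y), so that f_Y(0) = rate."""
    rng = random.Random(seed)
    r = int(math.sqrt(n) + 1.0 + 0.5)  # sqrt(N)+1 rounded to the nearest integer
    estimates = []
    for _ in range(reps):
        ys = sorted(rng.expovariate(rate) for _ in range(n))
        estimates.append(1.0 / (math.sqrt(n) * ys[r - 1]))
    mean = sum(estimates) / reps
    var = sum((e - mean) ** 2 for e in estimates) / reps
    return mean, math.sqrt(var)


mean, sd = simulate_f0()
print(mean)                       # close to f_Y(0) = 10, with a small negative bias
print(sd / (10.0 / 100 ** 0.25))  # ratio of empirical sd to sigma_T, close to one
```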
Table 3. Results of the Monte Carlo study using g_1(y) = e^{-10y}.
[Table entries illegible in the scan.]

Table 4. Results of the Monte Carlo study using g_2(y) = 1-y.
[Table entries illegible in the scan.]
Table 5. Results of the Monte Carlo study using g_3(y) = 1-y^2.
[Apart from the column heading "Sample Size," the table entries are illegible in the scan.]
The estimate (2.14) now takes the form

\hat{D}_N = \frac{\hat\theta\,\hat{f}_Y(0)}{2},

where

\hat\theta = \frac{N_0-1}{L}
\quad\text{and}\quad
\hat{f}_Y(0) = \left\{\sqrt{N_0}\;Y_{([\sqrt{N_0}+1])}\right\}^{-1}.

Then, upon substituting the appropriate expressions for the moments of \hat\theta and \hat{f}_Y(0) into the above, we get

E(\hat{D}_N) \approx D   (2.21)

and

Var(\hat{D}_N) \approx D^2\,\frac{\sqrt{N_0}+1}{N_0+2}.   (2.22)

2.4.6 Sample Size Determination Using \hat{D}_N

We can now determine the approximate value of N_0 that is needed to guarantee some preset value for the coefficient of variation of \hat{D}_N, CV(\hat{D}_N). These values for N_0 can then be compared to the corresponding values for N_0 (see Table 1) that are needed to ensure the same coefficient of variation with the parametric estimate, \hat{D}_u. Using (2.21) and (2.22), we see that an approximation for the coefficient of variation of \hat{D}_N is

CV(\hat{D}_N) \approx \frac{(\sqrt{N_0}+1)^{1/2}}{(N_0+2)^{1/2}},

and, by setting C = CV(\hat{D}_N), one can easily show that \sqrt{N_0} is a root of the quadratic equation

C^2N_0 - \sqrt{N_0} + 2C^2-1 = 0.
Solving for \sqrt{N_0} yields the two roots

\sqrt{N_0} = \frac{1 \pm \left(1-4C^2(2C^2-1)\right)^{1/2}}{2C^2},

and since \left(1-4C^2(2C^2-1)\right)^{1/2} \ge 1 whenever C \le 1/\sqrt{2}, the smaller root is not positive in the cases of interest. The required sample size for values of C \le .5 is therefore

N_0 = \left\lceil\left(\frac{1+\left(1-4C^2(2C^2-1)\right)^{1/2}}{2C^2}\right)^2\right\rceil.

For example, if C = .25, then N_0 = 284. Table 6 gives values for N_0 corresponding to coefficients of variation ranging from .2 to .5.

Table 6. Number of animals, N_0, that must be sighted to guarantee that the estimate, \hat{D}_N, has coefficient of variation, CV(\hat{D}_N).

CV(\hat{D}_N)    N_0
[table entries illegible in the scan]
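The computation can be sketched as below; with C = .25 it returns 284, agreeing with the example above, and it makes concrete the cost of leaving g(y) unspecified (compare N_0 = 35 for the parametric estimate at the same C in Table 1):

```python
import math


def nonparametric_sample_size(c):
    """Smallest N0 with CV(D_N-hat) <= c: take the larger root in sqrt(N0) of
    c^2*x^2 - x + (2c^2 - 1) = 0, then square and round up."""
    disc = 1.0 - 4.0 * c * c * (2.0 * c * c - 1.0)
    root = (1.0 + math.sqrt(disc)) / (2.0 * c * c)
    return math.ceil(root * root)


print(nonparametric_sample_size(0.25))  # 284, versus 35 for the parametric estimate
```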
CHAPTER III
DENSITY ESTIMATION BASED ON A COMBINATION OF INVERSE AND DIRECT SAMPLING

3.1 Introduction

When sampling a population by means of line transects, it is important to keep in mind that the transect length that can be covered by an observer will be finite. This poses a problem for the inverse sampling plan, since there will exist the possibility of not seeing the specified number of animals within the entire length of the transect. Therefore, it seems reasonable to develop a sampling scheme that would employ a stopping rule which allows one to stop when either a specified number, N_0, of animals has been seen or a fixed distance, L_0, has been travelled on the transect.

In this chapter we will consider a sampling plan which combines the inverse sampling procedure discussed in Chapter II and the direct sampling procedure of Gates et al. (1968). More precisely, we will define the combined sampling method as follows:

1. Place a line at random across the area, A, to be sampled.
2. Specify a fixed number of animals, N_0 > 2, and a fixed transect length, L_0, and then continue sampling along the transect until either N_0 animals are seen or a distance L_0 has been travelled.
Since the above method merely incorporates the individual stopping rules from the inverse and direct sampling methods, it seems reasonable to use the estimate

\hat{D}_{CP} = \hat{D}_u \text{ if } N = N_0, \quad \hat{D}_{CP} = \hat{D}_g \text{ if } N < N_0,   (3.1)

where N is a random variable corresponding to the actual number of animals sighted using combined sampling, \hat{D}_u is the inverse sampling estimator given in (2.7), and \hat{D}_g is an estimator appropriate for the direct sampling case. In other words, the combined sampling procedure uses the inverse sampling estimate if sampling terminates after N_0 animals are seen and the direct sampling estimate if sampling terminates after travelling a distance L_0. In Section 3.5 we will also show that \hat{D}_{CP} has a maximum likelihood justification. Before proceeding to derive the mean and variance of \hat{D}_{CP}, we need an estimate appropriate for the direct sampling case.

3.2 Gates' Estimate

Based on the direct sampling approach and assuming g(y) = e^{-\lambda y}, \lambda > 0, Gates et al. (1968) developed the estimate

\hat{D}_d = 0 \text{ for } n = 0,1, \quad \hat{D}_d = \frac{n(n-1)}{2L_0\sum_{i=1}^{n}y_i} \text{ for } n \ge 2,   (3.2)
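The estimate \hat{D}_{CP} switches between the two branches according to the reason sampling stopped; a sketch (argument names are ours), anticipating the modification \hat{D}_g made later in this section, which returns 0 when fewer than 3 animals are seen:

```python
def d_hat_combined(n0, l0, n_seen, dist_travelled, distances):
    """Combined-sampling estimate (3.1): inverse-sampling estimate if the quota
    n0 was reached, otherwise the (modified) direct-sampling estimate."""
    total_y = sum(distances)
    if n_seen == n0:
        # stopped because n0 animals were sighted after distance dist_travelled
        return (n0 - 1) ** 2 / (2.0 * dist_travelled * total_y)
    if n_seen <= 2:
        return 0.0  # too few sightings for a stable direct estimate
    # stopped because the fixed length l0 was exhausted after n_seen sightings
    return n_seen * (n_seen - 1) / (2.0 * l0 * total_y)


print(d_hat_combined(10, 5.0, 10, 4.0, [1.0] * 10))           # inverse branch: 1.0125
print(d_hat_combined(10, 5.0, 4, 5.0, [1.0, 1.0, 1.0, 1.0]))  # direct branch: 0.3
```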
where L_0 is the fixed length of the transect, n is an observed value of the random variable N_d, the number of animals seen using direct sampling, and y_i is an observed value of the random variable Y_i, i = 1,2,\ldots,n, the right angle distance to the i-th animal seen. In what follows, we shall show that the variance of \hat{D}_d is not finite. First, we need a result concerning the joint density of the Y_i, i = 1,2,\ldots,N_d, conditional on N_d.

Theorem 3. Under the assumptions stated in Section 2.2.1, conditional on N_d = n > 0, the random variables Y_1, Y_2, \ldots, Y_n are independently, identically distributed with common density

f_Y(y) = \lambda e^{-\lambda y}, \quad y>0, \quad \lambda>0.

Consequently, conditional on N_d = n > 0, the random variable \sum_{i=1}^{n}Y_i has a Gamma distribution with parameters n and \lambda.

Proof: We want to show that, for y_i > 0, i = 1,2,\ldots,n,

f_{Y_1\cdots Y_n}(y_1,\ldots,y_n\,|\,N_d=n) = \lambda^n e^{-\lambda\sum_{i=1}^{n}y_i}.

Recall that in the direct sampling procedure the total length travelled, L_0, is fixed, and define L_{N_d} to be the random variable representing the total length travelled on the transect when the n-th animal is sighted. Then the events \{N_d=n\} and \{L_n \le L_0 < L_{n+1}\}
are equivalent, so that

f_{Y_1\cdots Y_n}(y_1,\ldots,y_n\,|\,N_d=n) = \frac{f_{Y_1\cdots Y_n,\,L_n\le L_0<L_{n+1}}(y_1,\ldots,y_n)}{P(N_d=n)}.

Now, by Theorem 2, Y_1, Y_2, \ldots, Y_n, L_n and L_{n+1} are mutually independent, and consequently

f_{Y_1\cdots Y_n}(y_1,\ldots,y_n\,|\,N_d=n) = \prod_{i=1}^{n}\frac{g(y_i)}{c} = \lambda^n e^{-\lambda\sum_{i=1}^{n}y_i},

which completes the proof.

It is now easy to show that Var(\hat{D}_d) does not exist. From Theorem 3, conditional on N_d = n > 0, \sum_{i=1}^{n}Y_i has a Gamma distribution with parameters n and \lambda. Thus, using (2.12) and (3.2),

E(\hat{D}_d^2\,|\,N_d=n) = 0 \text{ for } n = 0,1, \qquad E(\hat{D}_d^2\,|\,N_d=n) = \frac{n^2(n-1)\lambda^2}{4L_0^2(n-2)} \text{ for } n > 2,

while E(\hat{D}_d^2\,|\,N_d=2) = +\infty, since E[1/(\sum Y_i)^2] is infinite when \sum Y_i has a Gamma distribution with shape parameter 2.
Also, since N_d is the number of sightings in a transect of length L_0, it follows from Theorem 2 that N_d has a Poisson distribution with parameter \theta L_0. Thus

E(\hat{D}_d^2) = E_{N_d}\,E(\hat{D}_d^2\,|\,N_d) = \frac{\lambda^2}{4L_0^2}\sum_{n=2}^{\infty}\frac{n^2(n-1)}{n-2}\,\frac{e^{-\theta L_0}(\theta L_0)^n}{n!} = \infty,

since the term corresponding to n = 2 is infinite, showing that the variance of the estimate \hat{D}_d defined in (3.2) is infinite. In fact, as long as P(N_d=2) > 0, the variance of \hat{D}_d cannot be finite. The problem of infinite variance for \hat{D}_d can be overcome by replacing \hat{D}_d with \hat{D}_g, where

\hat{D}_g = 0 \text{ if } n = 0,1,2, \qquad \hat{D}_g = \frac{n(n-1)}{2L_0\sum_{i=1}^{n}Y_i} \text{ if } n \ge 3.   (3.3)

Note that the estimate \hat{D}_g differs from \hat{D}_d only when n = 2. Since any estimate of the density based on only 2 sightings should be effectively 0, the above modification does not seem to be unreasonable. We will now proceed to derive expressions for the mean and variance of \hat{D}_g, which are needed in the sequel.
3.2.1 The Mean and Variance of \hat{D}_g

We will first examine E(\hat{D}_g). Recall from Theorem 3 that, conditional on N_d = n, n > 0, \sum_{i=1}^{n}Y_i has a Gamma distribution with parameters n and \lambda. Thus

E\left(\frac{1}{\sum_{i=1}^{n}Y_i}\,\bigg|\,N_d=n\right) = \frac{\lambda}{n-1}, \quad n \ge 2,

and

E(\hat{D}_g\,|\,N_d=n) = 0 \text{ for } n = 0,1,2, \qquad E(\hat{D}_g\,|\,N_d=n) = \frac{n\lambda}{2L_0} \text{ for } n \ge 3.

Now, since N_d is distributed as a Poisson random variable with parameter \theta L_0, it follows that

E(\hat{D}_g) = E_{N_d}\,E(\hat{D}_g\,|\,N_d) = \frac{\lambda}{2L_0}\sum_{n=3}^{\infty}n\,\frac{e^{-\theta L_0}(\theta L_0)^n}{n!} = \frac{\theta\lambda}{2}\left\{1-e^{-\theta L_0}(1+\theta L_0)\right\}.

Substituting the left hand side of (2.5) for \theta\lambda/2 in the above yields

E(\hat{D}_g) = D\left\{1-e^{-\theta L_0}(1+\theta L_0)\right\},

and, after writing \mu = \theta L_0, the expected number of sightings in a transect of length L_0, we get

E(\hat{D}_g) = D\left\{1-e^{-\mu}(1+\mu)\right\}.   (3.4)
Thus \hat{D}_g is not strictly unbiased; the bias arises because there is a positive probability of obtaining samples of size 2 or less. However, even for moderate values of \mu, the bias in \hat{D}_g will be small, since e^{-\mu}(1+\mu) tends to zero exponentially fast. For example, if \mu = 10, the relative bias is only .05%.

Next we will look at Var(\hat{D}_g). Again, since conditional on N_d = n, n > 0, \sum_{i=1}^{n}Y_i is distributed as a Gamma random variable with parameters n and \lambda, we know that

E\left[\frac{1}{(\sum_{i=1}^{n}Y_i)^2}\,\bigg|\,N_d=n\right] = \frac{\lambda^2}{(n-1)(n-2)}, \quad n>2,

and

E(\hat{D}_g^2\,|\,N_d=n) = 0 \text{ if } n = 0,1,2, \qquad E(\hat{D}_g^2\,|\,N_d=n) = \frac{n^2(n-1)\lambda^2}{4L_0^2(n-2)} \text{ if } n \ge 3.

Therefore,

E(\hat{D}_g^2) = E_{N_d}\,E(\hat{D}_g^2\,|\,N_d) = \frac{\lambda^2}{4L_0^2}\sum_{n=3}^{\infty}\frac{n^2(n-1)}{n-2}\,\frac{e^{-\mu}\mu^n}{n!},   (3.5)

and we can write

Var(\hat{D}_g) = \frac{\lambda^2}{4L_0^2}\sum_{n=3}^{\infty}\frac{n^2(n-1)}{n-2}\,\frac{e^{-\mu}\mu^n}{n!} - D^2\left\{1-e^{-\mu}(1+\mu)\right\}^2.   (3.6)

An approximation to Var(\hat{D}_g), valid for large values of \mu, may be derived in a manner analogous to the method used by Gates et al. (1968). After writing
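The size of the bias is easy to tabulate: from (3.4), the relative bias of \hat{D}_g is e^{-\mu}(1+\mu), which at \mu = 10 is about 5\times 10^{-4}, i.e. the .05% quoted above. A sketch:

```python
import math


def relative_bias(mu):
    """Relative bias of D_g-hat from (3.4): (D - E(D_g-hat)) / D = e^{-mu} * (1 + mu)."""
    return math.exp(-mu) * (1.0 + mu)


for mu in (2.0, 5.0, 10.0):
    print(mu, relative_bias(mu))  # decays to zero exponentially fast in mu
```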
\frac{n^2(n-1)}{n-2} = n^2+n+2+\frac{4}{n-2},

it is easy to see that, for n \ge 3,

n^2+n+2 \;\le\; \frac{n^2(n-1)}{n-2} \;\le\; n^2+n+6.

Thus, lower and upper bounds for E(\hat{D}_g^2) are

LB = \frac{\lambda^2}{4L_0^2}\sum_{n=3}^{\infty}(n^2+n+2)\,\frac{e^{-\mu}\mu^n}{n!}
\;\le\; E(\hat{D}_g^2) \;\le\;
LB + \frac{\lambda^2}{4L_0^2}\sum_{n=3}^{\infty}4\,\frac{e^{-\mu}\mu^n}{n!} = UB.

Upon using the relationships D = \theta\lambda/2 and \mu = \theta L_0, so that \lambda^2/(4L_0^2) = D^2/\mu^2, we get

UB - LB = \frac{4D^2}{\mu^2}\left\{1-e^{-\mu}\left(1+\mu+\frac{\mu^2}{2}\right)\right\},

which tends to 0 as \mu\to\infty. Thus, a reasonable approximation for E(\hat{D}_g^2), obtained by replacing n^2(n-1)/(n-2) with the intermediate value n^2+n+4, is

E(\hat{D}_g^2) \approx \frac{\lambda^2}{4L_0^2}\sum_{n=3}^{\infty}(n^2+n+4)\,\frac{e^{-\mu}\mu^n}{n!}
= \frac{\lambda^2}{4L_0^2}\left\{\mu^2+2\mu+4-e^{-\mu}(4+6\mu+5\mu^2)\right\}.   (3.7)
From (3.7), an approximation for Var(\hat{D}_g) is

Var(\hat{D}_g) \approx D^2\left\{1+\frac{2}{\mu}+\frac{4}{\mu^2}-e^{-\mu}\left(5+\frac{6}{\mu}+\frac{4}{\mu^2}\right)\right\} - D^2\left\{1-2e^{-\mu}(1+\mu)+e^{-2\mu}(1+\mu)^2\right\}.

Now, as \mu increases, the terms involving e^{-\mu} and e^{-2\mu} tend to 0 much faster than 2/\mu + 4/\mu^2, so that for large \mu we have the approximation

Var(\hat{D}_g) \approx D^2\left(\frac{2}{\mu}+\frac{4}{\mu^2}\right).   (3.8)

We are now in a position to derive the mean and variance of \hat{D}_{CP}.

3.3 Expected Value of \hat{D}_{CP}

Recall that in the combined sampling scheme both N, the number of animals seen, and L, the distance travelled before termination of sampling, are random variables. Thus, the expected value of \hat{D}_{CP} can be found directly using E(\hat{D}_{CP}) = E_N\,E(\hat{D}_{CP}\,|\,N). However, before proceeding along these lines, it will be helpful to have the following theorems.

Theorem 4. Let N be the random variable representing the number of animals seen using the combined sampling method. Then, under the assumptions stated in Section 2.2.1,
P(N=n) = \frac{e^{-\mu}\mu^n}{n!}, \quad n = 0,1,\ldots,N_0-1,

and

P(N=N_0) = \sum_{n=N_0}^{\infty}\frac{e^{-\mu}\mu^n}{n!},

where \mu = \theta L_0 is the expected number of animals sighted along a transect of length L_0.

Proof: For n < N_0, the event \{N=n\} occurs if and only if exactly n animals are sighted along a transect of fixed length L_0; by Theorem 1, the number of sightings in a length L_0 has a Poisson distribution with parameter \theta L_0, which gives the first part. The event \{N=N_0\} occurs if and only if at least N_0 animals would be sighted within the distance L_0, which gives the second part.
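Theorem 4's mass function is a Poisson law with its upper tail lumped at N_0; a sketch that tabulates it (the function name is ours):

```python
import math


def pmf_n(n, n0, mu):
    """P(N = n) under combined sampling: Poisson(mu) probabilities for n < n0,
    with all the mass from n0 onward collected at n = n0 (Theorem 4)."""
    if n < n0:
        return math.exp(-mu) * mu ** n / math.factorial(n)
    if n == n0:
        return 1.0 - sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(n0))
    return 0.0


# The probabilities over n = 0, ..., n0 sum to one:
print(sum(pmf_n(n, 5, 3.0) for n in range(6)))  # 1.0
```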
Theorem 5. Under the assumptions stated in Section 2.2.1, the conditional p.d.f. of L_N given N = n > 0 is

f_{L_N}(\ell\,|\,N=n) = \frac{n\ell^{n-1}}{L_0^n}, \quad 0<\ell<L_0, \quad n = 1,2,\ldots,N_0-1,

and

f_{L_N}(\ell\,|\,N=N_0) = \frac{\theta^{N_0}\ell^{N_0-1}e^{-\theta\ell}}{\Gamma(N_0)\,P(N=N_0)}, \quad 0<\ell<L_0,

where

P(N=N_0) = \sum_{n=N_0}^{\infty}\frac{e^{-\mu}\mu^n}{n!}.

Proof: First we will consider the case n < N_0. Given that exactly n animals are sighted along the fixed length L_0, the sighting positions are distributed as n independent uniform random variables on (0, L_0). The distance L_N travelled to the last sighting is then the maximum of these n uniform variables, whose density is n\ell^{n-1}/L_0^n, 0<\ell<L_0.
Now, in the combined sampling approach, we will see N = N_0 animals if and only if the distance

L = \sum_{i=1}^{N_0}T_i,

travelled to sight the N_0-th animal (T_i denoting the distance travelled between the (i-1)-st and i-th sightings), satisfies L \le L_0. Since L has a Gamma distribution with parameters N_0 and \theta, the conditional density of L_N given N = N_0 is this Gamma density truncated to the interval (0, L_0) and renormalized by P(L \le L_0) = P(N=N_0), which is the second part of the theorem.
Theorem 6. Under the assumptions stated in Section 2.2.1, conditional on N = n > 0, the random variables Y_1, Y_2, \ldots, Y_n and L_N are independent.

Proof: First consider the case N = N_0. Let \ell > 0 and y_i \ge 0 for i = 1,2,\ldots,N_0. We want to show that

P(Y_1\le y_1,\ldots,Y_{N_0}\le y_{N_0},\,L_N\le\ell\,|\,N=N_0) = P(Y_1\le y_1,\ldots,Y_{N_0}\le y_{N_0}\,|\,N=N_0)\,P(L_N\le\ell\,|\,N=N_0).

By Theorem 2, the right angle distances Y_1,\ldots,Y_{N_0} and the distance L are mutually independent; since the conditioning event \{N=N_0\} = \{L\le L_0\} depends on L alone, the required factorization follows.
Now consider the case N = n < N_0. Here sampling terminates at the fixed distance L_0, and by Theorem 2 the right angle distances are independent of the sighting positions along the transect; since L_N is a function of the sighting positions alone, Y_1,\ldots,Y_n and L_N are again independent. This completes the proof.

Theorem 7. Under the assumptions stated in Section 2.2.1, conditional on N = n > 0, the random variables Y_1, Y_2, \ldots, Y_n are independently, identically distributed with common density
f_Y(y) = \lambda e^{-\lambda y}, \quad y>0, \quad \lambda>0.

Consequently, conditional on N = n > 0, the random variable \sum_{i=1}^{n}Y_i has a Gamma distribution with parameters n and \lambda.

Proof: The case N = n < N_0 is precisely the situation covered by Theorem 3, since sampling then terminates after the fixed distance L_0, exactly as in direct sampling. For the case N = N_0, the result follows from Theorem 2: in inverse sampling, the Y_i are independently, identically distributed with density g(y)/c = \lambda e^{-\lambda y}, independently of L, and the conditioning event \{N=N_0\} = \{L\le L_0\} depends on L alone.
We are now ready to determine the expected value of \hat{D}_{CP} given N = n. For n = 0,1,2,

E(\hat{D}_{CP}\,|\,N=n) = E(\hat{D}_g\,|\,N=n) = 0.   (3.10)

Next, consider the values 3 \le n < N_0. By Theorem 7, conditional on N = n > 0, \sum_{i=1}^{n}Y_i has a Gamma distribution with parameters n and \lambda. Then, using expressions (3.1) and (3.3), it follows that

E(\hat{D}_{CP}\,|\,N=n) = E(\hat{D}_g\,|\,N=n) = E\left[\frac{N(N-1)}{2L_0\sum_{i=1}^{N}Y_i}\,\bigg|\,N=n\right] = \frac{n\lambda}{2L_0}.   (3.11)

Finally, for N = N_0, it follows from Theorem 5, Theorem 6, Theorem 7, and expressions (3.1) and (3.3) that

E(\hat{D}_{CP}\,|\,N=N_0) = E(\hat{D}_u\,|\,N=N_0)
= \frac{(N_0-1)^2}{2}\,E\left(\frac{1}{L_N}\,\bigg|\,N=N_0\right)E\left(\frac{1}{\sum_{i=1}^{N_0}Y_i}\,\bigg|\,N=N_0\right)
= D\,\frac{\sum_{n=N_0-1}^{\infty}e^{-\mu}\mu^n/n!}{P(N=N_0)}.   (3.12)
We can now evaluate the expected value of \hat{D}_{CP}. Using Theorem 4 and expressions (2.5), (3.10), (3.11), and (3.12), we find that

E(\hat{D}_{CP}) = E_N\,E(\hat{D}_{CP}\,|\,N)
= \sum_{n=3}^{N_0-1}\frac{n\lambda}{2L_0}\,P(N=n) + D\sum_{n=N_0-1}^{\infty}\frac{e^{-\mu}\mu^n}{n!}
= D\sum_{n=2}^{N_0-2}\frac{e^{-\mu}\mu^n}{n!} + D\sum_{n=N_0-1}^{\infty}\frac{e^{-\mu}\mu^n}{n!}
= D\left[1-e^{-\mu}(1+\mu)\right],   (3.13)

where \mu = \theta L_0. Thus \hat{D}_{CP} is a biased estimate for the density. Note that the bias here is equal to the bias of the modified estimate, \hat{D}_g, in direct sampling. This is as expected, since in the combined sampling procedure we are simply choosing the estimate that corresponds to the reason for terminating sampling: if we stop sampling after seeing the N_0-th animal, then the inverse sampling estimate is used, and, likewise, if sampling stops after travelling the distance L_0, then the direct sampling estimate is used.

3.4 Variance of \hat{D}_{CP}

An expression for the variance of \hat{D}_{CP} can be found directly using the formula

Var(\hat{D}_{CP}) = E(\hat{D}_{CP}^2) - [E(\hat{D}_{CP})]^2.   (3.14)

In the preceding section we derived E(\hat{D}_{CP}), so that our problem reduces to evaluating E(\hat{D}_{CP}^2). Proceeding along the same lines as in Section 3.3, we quickly find
E(\hat{D}_{CP}^2\,|\,N=n) = 0 \text{ for } n = 0,1,2,

E(\hat{D}_{CP}^2\,|\,N=n) = \frac{n^2(n-1)\lambda^2}{4L_0^2(n-2)} \text{ for } 2 < n < N_0,

and, using Theorems 5, 6 and 7 as in (3.12),

E(\hat{D}_{CP}^2\,|\,N=N_0) = D^2\left(\frac{N_0-1}{N_0-2}\right)^2\frac{\sum_{n=N_0-2}^{\infty}e^{-\mu}\mu^n/n!}{P(N=N_0)}.

Combining these conditional moments with Theorem 4 yields
E(\hat{D}_{CP}^2) = D^2\left[\sum_{n=1}^{N_0-3}\frac{n+2}{n}\,\frac{e^{-\mu}\mu^n}{n!} + \left(\frac{N_0-1}{N_0-2}\right)^2\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n!}\right].   (3.18)

An expression for the variance of \hat{D}_{CP} is now evident. Using (3.13), (3.14), and (3.18), we get

Var(\hat{D}_{CP}) = D^2\left[\sum_{n=1}^{N_0-3}\frac{n+2}{n}\,\frac{e^{-\mu}\mu^n}{n!} + \left(\frac{N_0-1}{N_0-2}\right)^2\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n!}\right] - D^2\left[1-e^{-\mu}(1+\mu)\right]^2,   (3.19)

where \mu = \theta L_0. Note that

\lim_{L_0\to\infty}Var(\hat{D}_{CP}) = D^2\,\frac{2N_0-3}{(N_0-2)^2}

and

\lim_{N_0\to\infty}Var(\hat{D}_{CP}) = D^2\left\{\sum_{n=1}^{\infty}\frac{n+2}{n}\,\frac{e^{-\mu}\mu^n}{n!} - \left[1-e^{-\mu}(1+\mu)\right]^2\right\}.

After some simple algebraic manipulations and using the relationship D = \theta\lambda/2, one can easily show that the limit as L_0\to\infty and the limit as N_0\to\infty are equal to Var(\hat{D}_u) given in (2.13) and Var(\hat{D}_g) given in (3.6), respectively. These limiting values are as expected, since letting L_0\to\infty in the combined sampling approach is equivalent to using inverse sampling, while letting N_0\to\infty is equivalent to using direct sampling.

We will now show that Var(\hat{D}_{CP}) can be expressed as a function of both Var(\hat{D}_u) and Var(\hat{D}_g) given in (2.13) and (3.6), respectively. Writing the equation in this form will then lead directly to an approximation for Var(\hat{D}_{CP}).
First note that (3.19) can be rewritten as

Var(\hat{D}_{CP}) = D^2\sum_{n=1}^{N_0-3}\frac{n+2}{n}\,\frac{e^{-\mu}\mu^n}{n!} + D^2\left(\frac{N_0-1}{N_0-2}\right)^2\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n!} - D^2\left[1-e^{-\mu}(1+\mu)\right]^2.   (3.20)

Adding and subtracting the terms

D^2\sum_{n=N_0-2}^{\infty}\frac{n+2}{n}\,\frac{e^{-\mu}\mu^n}{n!}
\quad\text{and}\quad
D^2\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n!}

on the right hand side of (3.20), and substituting the expressions for Var(\hat{D}_u) and Var(\hat{D}_g) given in (2.13) and (3.6), respectively, we see that

Var(\hat{D}_{CP}) = \left[\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n!}\right]Var(\hat{D}_u) + Var(\hat{D}_g) - 2D^2\sum_{n=N_0-2}^{\infty}\frac{e^{-\mu}\mu^n}{n\cdot n!}.   (3.21)

Therefore, an approximation for Var(\hat{D}_{CP}) can be obtained simply by using the approximation for Var(\hat{D}_g) given in (3.8).
3.5 Maximum Likelihood Justification for \hat{D}_{CP}

In Section 3.1 we stated that the estimate \hat{D}_{CP} has a maximum likelihood justification. The likelihood function for the combined sampling scheme is now evident. Using Theorems 4 and 5, and recalling from Theorem 7 that

f_Y(y\,|\,N=n) = \lambda^n e^{-\lambda\sum_{i=1}^{n}y_i}, \quad y_i>0,

we obtain

L(\theta,\lambda;\,n,y,\ell) = \frac{n\,\theta^n e^{-\theta L_0}\ell^{n-1}}{n!}\,\lambda^n e^{-\lambda\sum_{i=1}^{n}y_i} \quad\text{for } N=n<N_0,

and

L(\theta,\lambda;\,n,y,\ell) = \frac{\theta^{N_0}\ell^{N_0-1}e^{-\theta\ell}}{\Gamma(N_0)}\,\lambda^{N_0}e^{-\lambda\sum_{i=1}^{N_0}y_i} \quad\text{for } N=N_0,

so that the maximum likelihood estimate of D = \theta\lambda/2 is \hat{D} = \hat\theta\hat\lambda/2,
where \hat\theta and \hat\lambda are maximum likelihood estimates of \theta and \lambda, respectively. Finding the maximum likelihood estimates for \theta and \lambda is now straightforward. Taking the natural logarithm of the likelihood function and setting the partial derivatives with respect to \theta and \lambda equal to 0 yields, for n > 0,

\hat\theta = \frac{N}{L_0} \text{ if } N=n<N_0, \qquad \hat\theta = \frac{N_0}{\ell} \text{ if } N=N_0,

and

\hat\lambda = \frac{N}{\sum_{i=1}^{N}y_i}.

The resulting estimate \hat{D} = \hat\theta\hat\lambda/2 equals N^2/(2L_0\sum y_i) when N < N_0 and N_0^2/(2\ell\sum y_i) when N = N_0. Apart from the adjustments made to reduce bias (and the value 0 assigned when n \le 2), these are precisely the two branches of \hat{D}_{CP} given in (3.1), which provides the maximum likelihood justification for \hat{D}_{CP}.
CHAPTER IV
DENSITY ESTIMATION FOR CLUSTERED POPULATIONS

4.1 Introduction

The estimation procedures developed in Chapters II and III are based on the assumption that the sightings of animals are independent events. These methods would be applicable to animal populations that are generally made up of solitary individuals, such as ruffed grouse, snowshoe hare and gila monster. However, there are other types of animals which aggregate into coveys, schools and other tight groups. Animals behaving in this way will be said to belong to clustered populations. Some examples of clustered populations are bobwhite quail, gray partridge and porpoise. In these cases the assumption of independent sightings is certainly not valid, and a different procedure would have to be used.

The line transect method could be easily generalized to provide estimates for clustered populations. As noted by Anderson et al. (1976, p. 12), if we amend the assumptions in Section 2.2.1 so that they refer to clusters of animals rather than individual animals, then the results of Chapters II and III are directly applicable to the estimation of the cluster density, D_c. The estimate for D_c will be
based on the right angle distances to the clusters from the random line transect. In the case where the number of animals in every sighted cluster can be determined without error, an estimate for the population density D is given by

\hat{D} = \hat{D}_c\,\bar{s},

where \hat{D}_c is the estimate for D_c and \bar{s} is the average size of the observed clusters.

Some criticisms of the approach outlined in the preceding paragraph are possible. First of all, it may not be possible to determine the distance to a cluster as easily (or as accurately) as the distance to an animal. How will this distance be defined? Secondly, the simple modification of the assumptions in Section 2.2.1, obtained by replacing the word "animal" by the word "cluster," would imply that the probability of sighting a cluster depends only on its right angle distance from the line. This may not be a reasonable assumption, since the probability of sighting a larger cluster is likely to be greater than the probability of sighting a smaller cluster. Finally, the sighting of a cluster may not necessarily mean that all of the animals comprising the cluster are seen and counted by the observer. In this case, a more reasonable assumption would be to let the probability of sighting an animal belonging to a cluster depend on the distance to the cluster as well as the true cluster size.

In this chapter we shall propose a density estimate for a clustered population by assuming, among other things, that
it is possible to determine the distance to the center of the cluster from the line transect. An estimation procedure will then be developed using a model in which the observer's count of the number of animals in a cluster is regarded as a random variable with a probability distribution depending upon the right angle distance and the size of the cluster.

4.2 Assumptions

The density estimate that we will develop is based on the inverse sampling approach outlined in Section 2.1, with one minor modification. In clustered populations the plan is to continue sampling along a randomly placed transect until a prespecified number, N_c, of clusters (rather than animals) are seen. As each cluster is sighted, the following information is recorded:

1. the right angle distance, y, from the transect to the center of the cluster;
2. the observed number of animals, s, in the cluster (this may be less than the true size of the cluster);
3. the actual distance, \ell, travelled by the observer to sight N_c clusters.

The sampling procedure described above may be used to construct an estimate of the population density under the following set of assumptions. These assumptions closely parallel those of Section 2.2.1, with the exception that they are now phrased in terms of clusters rather than individual animals.
B1. The clusters are randomly distributed with rate (density) D_c over the area of interest, A.

B2. The clusters are independently distributed over A, i.e., given two disjoint regions of areas \delta A_1 and \delta A_2,

P(n_1 clusters are in \delta A_1 and n_2 clusters are in \delta A_2) = P(n_1 clusters are in \delta A_1)\,P(n_2 clusters are in \delta A_2).

B3. Clusters are fixed, i.e., there is no confusion over clusters moving during sampling and none are counted twice.

B4. There exists a probability mass function p(\cdot), defined on the set of positive integers, such that p(r) is the probability that r is the true size of a cluster located at a right angle distance, y, from the transect. Note that p(r) is independent of y. In probability notation, if R and Y denote the random variables representing the true cluster size and the right angle distance to the cluster, respectively, then

P(R=r\,|\,Y=y) = p(r), \quad r = 1,2,\ldots.   (4.1)

B5. The probability of observing a cluster depends only on the size of the cluster and the distance from the transect to the cluster.

B6. There exists a non-negative function h(\cdot), defined on [0,\infty), such that 0 \le h(\cdot) \le 1, h(0) = 1, and the probability of observing s animals belonging to a cluster of size r \ge s, located at a right angle distance y from the transect, is

\binom{r}{s}\left[h(y)\right]^s\left[1-h(y)\right]^{r-s}.

That is, if Y and S denote the random variables representing the right angle distance to a cluster and the observed number of animals in a cluster, respectively, then

P(S=s\,|\,R=r,Y=y) = \binom{r}{s}\left[h(y)\right]^s\left[1-h(y)\right]^{r-s}.   (4.2)
Closer examination of assumption B6 shows that we are now allowing the probability of observing a cluster to depend on both the right angle distance, y, and the true cluster size, r. To see this, first let C be the event that a cluster is observed. Then the probability of observing a cluster of size r located at distance y from the transect is

P(C\,|\,R=r,Y=y) = \sum_{s=1}^{r}P(S=s\,|\,R=r,Y=y) = 1 - P(S=0\,|\,R=r,Y=y) = 1 - \left[1-h(y)\right]^r,   (4.3)

which clearly depends on both y and r. The assumption B6 also satisfies the reasonable requirement that, for a fixed right angle distance y > 0 and r_1 < r_2,

P(C\,|\,R=r_1,Y=y) \le P(C\,|\,R=r_2,Y=y).

This follows immediately from equation (4.3). Note that

P(C\,|\,R=r_1,Y=y) = 1-\left[1-h(y)\right]^{r_1} \quad\text{and}\quad P(C\,|\,R=r_2,Y=y) = 1-\left[1-h(y)\right]^{r_2}.

Now, since 0 \le 1-h(y) \le 1, we have [1-h(y)]^{r_1} \ge [1-h(y)]^{r_2}, and the inequality follows. Finally, observe that if every cluster is of size r = 1, then
the probability of sighting a cluster located at a right angle distance y is simply h(y). This is quickly seen by setting r = 1 in (4.3). Thus, under these circumstances, h(y) has the same interpretation as g(y) defined in Section 2.2.1; that is, h(y) is the conditional probability of sighting an animal at distance y given there is an animal at y.

4.3 General Form of the Likelihood Function

We will use the maximum likelihood procedure to obtain an estimate for D, the animal population density. To obtain the likelihood function, we first need an expression for the joint probability density function of S, Y, and L, where S = (S_1, S_2, \ldots, S_{N_c}) is the vector of random variables representing the actual numbers of animals seen in the clusters, Y = (Y_1, Y_2, \ldots, Y_{N_c}) is the vector of random variables representing the right angle distances from the clusters to the transect, and L is the random variable representing the total length travelled on the transect to see N_c clusters. Upon writing

f_{S,Y,L}(s,y,\ell) = f_{S|Y,L}(s\,|\,y,\ell)\,f_{Y|L}(y\,|\,\ell)\,f_L(\ell),   (4.4)

it is seen that specifying the joint probability density function for S, Y and L is equivalent to specifying the three functions on the right hand side of (4.4).
The density functions f_{Y|L}(y\,|\,\ell) and f_L(\ell) can be derived in a manner analogous to that used in Section 2.2. Let g_c(y) denote the probability of sighting a cluster located at a right angle distance y from the transect, that is,

g_c(y) = P(\text{observe a cluster}\,|\,Y=y).

Since sighting a cluster located at a distance y is equivalent to observing at least one animal belonging to the cluster, we can write

g_c(y) = \sum_{s=1}^{\infty}P(\text{observe } s \text{ animals}\,|\,Y=y) = \sum_{s=1}^{\infty}P(S=s\,|\,Y=y).

Now, for s \ge 1,

P(S=s\,|\,Y=y) = \sum_{r=s}^{\infty}P(S=s\,|\,R=r,Y=y)\,P(R=r\,|\,Y=y).

By assumption B4, it follows that Y and R are independent random variables. Thus, using (4.1) and (4.2), we get

P(S=s\,|\,Y=y) = \sum_{r=s}^{\infty}\binom{r}{s}\left[h(y)\right]^s\left[1-h(y)\right]^{r-s}p(r).

Therefore,

g_c(y) = \sum_{s=1}^{\infty}\sum_{r=s}^{\infty}\binom{r}{s}\left[h(y)\right]^s\left[1-h(y)\right]^{r-s}p(r).   (4.5)

Now, according to assumption B6, h(0) = 1, so that

g_c(0) = \sum_{s=1}^{\infty}p(s) = 1.
78 Therefore, the function g^(y) plays a role similar to the role of g(y) in Section 2.2. Consequently, by regarding a "cluster" as an "animal," the results of Section 2.2 can be applied to clustered populations in a straightforward manner. Let N (?,) denote the random variable representing the number of clusters seen when travelling a distance £ on the transect. Then, by Theorem 1, N (£) is a Poisson process with parameter e*iJ,, where Q* I is the expected number of clusters seen when travelling along a transect of length Z . Also, from Theorem 1 we see that the respective analogs to equations (2.2) and (2.1) are D =-^ , (4.6) ^ 2c where D is the density of clusters and c ' * c = g^(y)dy. (4.7) o Furthermore, Theorem 2 gives us the results that L and Y are mutually independent random variables, L is distributed as a Gamma random variable with parameters N and G and the '^ c conditional density of Y given L = S, is N fylLCzU) = fy^y^^ ^r ""^ P'c^^i^^"^-^^ -' ^ *x c i=l (c ) Now, assumption B5 implies that the number of animals actually observed in a cluster depends only on the right angle distances to the animals, Y, and the size of the cluster, R. Thus, S^ is independent of L, and since Y is
also independent of $L$, it follows that

$$f_{S|Y}(s\,|\,y) = \prod_{i=1}^{N_c} P(S_i = s_i \mid Y_i = y_i). \qquad (4.9)$$

We can now write an expression for the likelihood function $L(\theta^*, p(\cdot), h(\cdot);\, s, y, \ell)$. Using (4.4), (4.8) and (4.9), and recalling that $L$ has a Gamma distribution with parameters $N_c$ and $\theta^*$, we obtain

$$L(\theta^*, p(\cdot), h(\cdot);\, s, y, \ell) = \prod_{i=1}^{N_c} P(S_i = s_i \mid Y_i = y_i) \cdot \frac{\prod_{i=1}^{N_c} g_c(y_i)}{(c^*)^{N_c}} \cdot \frac{(\theta^*)^{N_c}\, \ell^{N_c-1}\, e^{-\theta^*\ell}}{\Gamma(N_c)}. \qquad (4.10)$$

4.4 Estimation of D when p(·) and h(·) Have Specific Forms

For a clustered population with a cluster density $D_c$, the animal population density may be defined as

$$D = D_c\, \nu,$$

where $\nu = E(R)$ is the expected cluster size. Upon using the expression for $D_c$ given in (4.6), we get

$$D = \frac{\theta^* \nu}{2 c^*}, \qquad (4.11)$$

so that maximum likelihood estimation of $D$ can be carried out by using (4.11) in the likelihood function presented in (4.10).
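The Gamma$(N_c, \theta^*)$ distribution of $L$ invoked in (4.10) can be verified by simulation: cluster sightings along the transect form a Poisson process, so the distance needed to record $N_c$ sightings is a sum of $N_c$ independent exponential gaps. A small Python sketch with arbitrary illustrative values of the rate and stopping count:

```python
import random
import statistics

random.seed(1)
theta_star = 2.0     # hypothetical rate: expected clusters per unit length
N_c = 25             # sampling stops after N_c cluster sightings

def transect_length():
    # Cluster sightings form a Poisson process of rate theta*, so the
    # distance needed to record N_c sightings is a sum of N_c independent
    # exponential gaps, i.e. a Gamma(N_c, theta*) random variable.
    return sum(random.expovariate(theta_star) for _ in range(N_c))

draws = [transect_length() for _ in range(20000)]
print(statistics.mean(draws))      # close to N_c / theta* = 12.5
print(statistics.variance(draws))  # close to N_c / theta*^2 = 6.25
```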
Since the random variables $S$ and $Y$ are independent of $L$, it is easily seen from (4.10) that the maximum likelihood estimate of $\theta^*$, corrected for bias, is

$$\hat\theta^* = \frac{N_c - 1}{L}. \qquad (4.12)$$

However, finding estimates of $\nu$ and $c^*$ can be quite difficult, depending upon the nature of the functions $p(\cdot)$ and $h(\cdot)$. Very likely, one has to resort to some iterative technique such as the Newton-Raphson method (see Korn and Korn, 1968, eqn. (20.2-31)) to solve the likelihood equation.

It is apparent that there exists a wide variety of functions which satisfy the requirements of $p(\cdot)$ and $h(\cdot)$. The appropriate choice in a particular problem depends on the nature of the population under investigation. In this work we will consider the functions

$$p(r) = \frac{a^r e^{-a}}{r!\,(1 - e^{-a})}, \qquad a > 0, \quad r = 1, 2, \ldots, \qquad (4.13)$$

and

$$h(y) = e^{-\lambda^* y}, \qquad \lambda^* > 0, \quad y \ge 0. \qquad (4.14)$$

It is easily seen that $p(\cdot)$ given by (4.13) represents a truncated Poisson distribution. The expected cluster size $\nu$ is therefore given by

$$\nu = \frac{a}{1 - e^{-a}}. \qquad (4.15)$$

The limiting case $a = 0$ corresponds to a population in which the cluster size is 1 with probability 1. Thus, $a = 0$ corresponds to the model in Section 2.2.
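The truncated Poisson form (4.13) and the mean formula (4.15) can be confirmed numerically; the following Python fragment (not part of the original typescript; the value of $a$ is arbitrary) checks that the probabilities sum to one and that the mean equals $a/(1-e^{-a})$:

```python
from math import exp, factorial

def p_trunc(r, a):
    """Truncated Poisson pmf of (4.13): a^r e^{-a} / (r! (1 - e^{-a})), r >= 1."""
    return a**r * exp(-a) / (factorial(r) * (1.0 - exp(-a)))

a = 1.7                                           # arbitrary illustrative value
mass = sum(p_trunc(r, a) for r in range(1, 101))  # tail beyond 100 is negligible
mean = sum(r * p_trunc(r, a) for r in range(1, 101))
print(mass)                        # essentially 1
print(mean, a / (1 - exp(-a)))     # both equal the nu of (4.15)
```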
The choice for $h(\cdot)$ is based on the fact that when $a = 0$, $h(\cdot)$ may be interpreted as the function $g(\cdot)$ defined in Chapter II. Because $g(y) = e^{-\lambda y}$ seems to be a popular choice for $g(\cdot)$, we feel that $h(y) = e^{-\lambda^* y}$ is a reasonable choice for $h(\cdot)$.

The likelihood function may now be regarded as a function of $\theta^*$, $a$ and $\lambda^*$, and maximum likelihood estimation of $D$ may be accomplished by expressing $\nu$ and $c^*$ as functions of $a$ and $\lambda^*$. We have already seen the form of $\nu$ in equation (4.15). To derive an expression for $c^*$ we proceed as follows. Recall from (4.5) that

$$g_c(y) = \sum_{s=1}^{\infty} P(S = s \mid Y = y),$$

where

$$P(S = s \mid Y = y) = \sum_{r=s}^{\infty} \binom{r}{s} [h(y)]^{s} [1 - h(y)]^{r-s}\, p(r).$$

Now, using (4.13) and (4.14) in the above equation, we get

$$P(S = s \mid Y = y) = \sum_{r=s}^{\infty} \binom{r}{s} e^{-\lambda^* y s} \left(1 - e^{-\lambda^* y}\right)^{r-s} \frac{a^r e^{-a}}{r!\,(1 - e^{-a})}$$

$$= \frac{\left(a e^{-\lambda^* y}\right)^{s} e^{-a}}{s!\,(1 - e^{-a})} \sum_{r=s}^{\infty} \frac{\left[a \left(1 - e^{-\lambda^* y}\right)\right]^{r-s}}{(r-s)!}$$

$$= \frac{\left(a e^{-\lambda^* y}\right)^{s} e^{-a e^{-\lambda^* y}}}{s!\,(1 - e^{-a})}. \qquad (4.16)$$
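The closed form (4.16) can be checked against the binomial-mixture sum it was derived from; the Python sketch below (arbitrary illustrative values of $a$ and $\lambda^*$) evaluates both sides for a few $(s, y)$ pairs:

```python
from math import comb, exp, factorial

def p_s_given_y_sum(s, y, a, lam, r_max=100):
    """P(S=s|Y=y) as the binomial mixture over cluster sizes r >= s, with
    p(r) the truncated Poisson of (4.13) and h(y) = exp(-lam*y) of (4.14)."""
    h = exp(-lam * y)
    return sum(comb(r, s) * h**s * (1 - h)**(r - s)
               * a**r * exp(-a) / (factorial(r) * (1 - exp(-a)))
               for r in range(s, r_max + 1))

def p_s_given_y_closed(s, y, a, lam):
    """Closed form (4.16): (a e^{-lam y})^s e^{-a e^{-lam y}} / (s! (1-e^{-a}))."""
    u = a * exp(-lam * y)
    return u**s * exp(-u) / (factorial(s) * (1 - exp(-a)))

a, lam = 2.5, 0.1    # hypothetical parameter values
for s, y in [(1, 0.0), (2, 3.0), (4, 10.0)]:
    print(p_s_given_y_sum(s, y, a, lam), p_s_given_y_closed(s, y, a, lam))
```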
Then, substituting for $P(S = s \mid Y = y)$, we get

$$g_c(y) = \frac{e^{-a e^{-\lambda^* y}}}{1 - e^{-a}} \sum_{s=1}^{\infty} \frac{\left(a e^{-\lambda^* y}\right)^{s}}{s!} = \frac{1 - e^{-a e^{-\lambda^* y}}}{1 - e^{-a}}. \qquad (4.17)$$

Therefore, using (4.7) and (4.17), we get

$$c^* = \int_0^{\infty} g_c(y)\, dy = \int_0^{\infty} \frac{1 - e^{-a e^{-\lambda^* y}}}{1 - e^{-a}}\, dy. \qquad (4.18)$$

To evaluate $c^*$, note that

$$\int_0^{\infty} \left(1 - e^{-a e^{-\lambda^* y}}\right) dy = \lim_{x \to \infty} \left\{ x - \int_0^x e^{-a e^{-\lambda^* y}}\, dy \right\}. \qquad (4.19)$$

By letting $t = e^{-\lambda^* y}$ in the integral on the right hand side of (4.19), we can show

$$\int_0^x e^{-a e^{-\lambda^* y}}\, dy = x + \frac{1}{\lambda^*} \sum_{j=1}^{\infty} \frac{(-1)^j\, a^j \left(1 - e^{-j \lambda^* x}\right)}{j \cdot j!},$$

and upon substituting into (4.19), we get

$$\int_0^{\infty} \left(1 - e^{-a e^{-\lambda^* y}}\right) dy = \lim_{x \to \infty} \left\{ -\frac{1}{\lambda^*} \sum_{j=1}^{\infty} \frac{(-1)^j\, a^j \left(1 - e^{-j \lambda^* x}\right)}{j \cdot j!} \right\}.$$
Since the sum above is absolutely convergent, we can take the limit inside the sum and obtain

$$\int_0^{\infty} \left(1 - e^{-a e^{-\lambda^* y}}\right) dy = \frac{1}{\lambda^*} \sum_{j=1}^{\infty} \frac{(-1)^{j+1}\, a^j}{j \cdot j!}.$$

Then, using (4.18), it follows that

$$c^* = \frac{\alpha(a)}{\lambda^* \left(1 - e^{-a}\right)}, \qquad (4.20)$$

where

$$\alpha(a) = \sum_{j=1}^{\infty} \frac{(-1)^{j+1}\, a^j}{j \cdot j!}. \qquad (4.21)$$

We can now write the likelihood function in terms of $\theta^*$, $\lambda^*$ and $a$. Using equations (4.10), (4.16), (4.20) and (4.21), we get

$$L(\theta^*, \lambda^*, a;\, s, y, \ell) = \frac{a^{\sum_{i=1}^{N_c} s_i}\, e^{-\lambda^* \sum_{i=1}^{N_c} y_i s_i}\, e^{-a \sum_{i=1}^{N_c} e^{-\lambda^* y_i}}}{\prod_{i=1}^{N_c} s_i!\, \left(1 - e^{-a}\right)^{N_c}} \cdot \frac{(\lambda^*)^{N_c} \prod_{i=1}^{N_c} \left(1 - e^{-a e^{-\lambda^* y_i}}\right) (\theta^*)^{N_c}\, \ell^{N_c-1}\, e^{-\theta^* \ell}}{[\alpha(a)]^{N_c}\, \Gamma(N_c)}. \qquad (4.22)$$

Using the likelihood function given in (4.22), we can now obtain an estimate for the population density $D$. Recall from (4.11) that we can write

$$D = \frac{\theta^* \nu}{2 c^*}.$$
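The series form of $c^*$ in (4.20)–(4.21) can be cross-checked numerically against the integral in (4.18). A Python sketch (the parameter values are arbitrary positive illustrations, and the midpoint-rule quadrature is our own stand-in for the exact integral):

```python
from math import exp, factorial

def alpha(a, j_max=80):
    """alpha(a) of (4.21): sum_{j>=1} (-1)^{j+1} a^j / (j * j!)."""
    return sum((-1)**(j + 1) * a**j / (j * factorial(j)) for j in range(1, j_max + 1))

def c_star_series(a, lam):
    """Closed form (4.20): c* = alpha(a) / (lam* (1 - e^{-a}))."""
    return alpha(a) / (lam * (1 - exp(-a)))

def c_star_quad(a, lam, upper=400.0, n=200000):
    """Midpoint-rule quadrature of the integral in (4.18); g_c decays like
    e^{-lam*y}, so truncating at `upper` leaves a negligible tail."""
    dy = upper / n
    g = lambda yy: (1 - exp(-a * exp(-lam * yy))) / (1 - exp(-a))
    return sum(g((k + 0.5) * dy) for k in range(n)) * dy

a, lam = 2.844, 0.052   # arbitrary positive values
print(c_star_series(a, lam), c_star_quad(a, lam))   # the two agree closely
```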
After substituting for $c^*$ and $\nu$ using equations (4.15), (4.20) and (4.21), the expression for $D$ becomes

$$D = \frac{\theta^* \lambda^* a}{2\, \alpha(a)}.$$

Thus, an estimate for $D$ would be

$$\hat D = \frac{\hat\theta^* \hat\lambda^* \hat a}{2\, \alpha(\hat a)},$$

where $\hat\theta^*$, $\hat\lambda^*$ and $\hat a$ are maximum likelihood estimates for $\theta^*$, $\lambda^*$ and $a$, respectively. As noted earlier in this section, $S$ and $Y$ are independent of $L$, so that $\theta^*$ can be estimated using equation (4.12). However, this still leaves us the problem of estimating $\lambda^*$ and $a$. Instead of estimating $\lambda^*$ and $a$ separately, we can reparameterize the likelihood function in (4.22) by letting

$$\rho = \frac{\lambda^* a}{\alpha(a)}$$

and retaining $a$. Then, our estimate for $D$ becomes

$$D^* = \frac{\hat\theta^* \hat\rho}{2}. \qquad (4.23)$$

The advantage of this reparameterization is that it makes use of the fact that $L$ is independent of both $S$ and $Y$. Thus, the estimate for $D$ given in (4.23) is now the product of two independent estimates, $\hat\theta^*$, which depends on $L$ alone, and $\hat\rho$, which depends on $S$ and $Y$. As a result, the variance of $D^*$ can now be found easily. Using the formula (see Goodman, 1960) for
the variance of the product of two independent estimates, we get

$$\mathrm{Var}(D^*) = \frac{1}{4} \left[ E^2(\hat\theta^*)\, \mathrm{Var}(\hat\rho) + E^2(\hat\rho)\, \mathrm{Var}(\hat\theta^*) + \mathrm{Var}(\hat\theta^*)\, \mathrm{Var}(\hat\rho) \right]. \qquad (4.24)$$

Since $L$ is distributed as a Gamma random variable with parameters $N_c$ and $\theta^*$, exact expressions for $E(\hat\theta^*)$ and $\mathrm{Var}(\hat\theta^*)$ can be obtained using (2.4) and (2.11), i.e.,

$$E(\hat\theta^*) = \theta^*$$

and

$$\mathrm{Var}(\hat\theta^*) = \frac{(\theta^*)^2}{N_c - 2}. \qquad (4.25)$$

Expressions for the variance and expected value of $\hat\rho$ can become quite complicated. An iterative scheme would be needed to find the solutions for $\hat\rho$ and $\hat a$ that maximize the reparameterized version of the likelihood function given in (4.22). There are computer programs available that can provide maximum likelihood estimates for $\hat\rho$ and $\hat a$, along with numerical approximations for the variance-covariance matrix of the estimates. In the next section we will demonstrate the use of one such program with a set of hypothetical data.
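Goodman's exact product-variance formula used in (4.24) is simple to implement, and can be checked on a small case that is enumerable by hand (the two-point distributions below are our own illustration):

```python
def goodman_var(mean_x, var_x, mean_y, var_y):
    """Goodman (1960): exact variance of the product of two independent
    estimates, Var(XY) = E(X)^2 Var(Y) + E(Y)^2 Var(X) + Var(X) Var(Y)."""
    return mean_x**2 * var_y + mean_y**2 * var_x + var_x * var_y

# Hand-checkable case: X uniform on {0, 2} and Y uniform on {0, 4},
# independent, so E(X)=1, Var(X)=1, E(Y)=2, Var(Y)=4.  Then XY takes the
# value 8 with probability 1/4 and 0 otherwise, giving Var(XY) = 12.
print(goodman_var(1.0, 1.0, 2.0, 4.0))   # -> 12.0
```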
4.5 A Worked Example

In this section we will present a worked example to demonstrate the use of a computer program to find the estimate $D^*$ and its approximate variance. Because we are not aware of any real data that have been collected according to the sampling plan described in Section 4.2, we shall use an artificial set of data in the example.

Suppose that sampling was continued until $N_c = 25$ clusters were sighted, and that a transect length of $\ell = 25$ miles was needed to sight the 25 clusters. Suppose further that the observed right angle distances and the cluster sizes were as follows, where the first number in each pair is the right angle distance, $y$, measured in yards, and the second number is the corresponding cluster size, $s$:

(1,1), (3,2), (7,1), (10,1), (2,3),
(5,5), (4,1), (7,2), (15,1), (22,1),
(6,1), (3,6), (2,1), (12,1), (28,3),
(9,2), (18,1), (36,7), (17,6), (5,1),
(4,1), (3,1), (8,2), (3,4), (13,1).

As noted in Section 4.4, an estimate for $\theta^*$ is

$$\hat\theta^* = \frac{N_c - 1}{\ell} = \frac{24}{25} = .96,$$

and an estimate for the variance of $\hat\theta^*$ is

$$\widehat{\mathrm{Var}}(\hat\theta^*) = \frac{(\hat\theta^*)^2}{N_c - 2} = .0401.$$
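These two estimates, together with the sample means that will serve as starting values below, follow directly from the data; a short Python transcription of the arithmetic:

```python
data = [(1,1),(3,2),(7,1),(10,1),(2,3),(5,5),(4,1),(7,2),(15,1),(22,1),
        (6,1),(3,6),(2,1),(12,1),(28,3),(9,2),(18,1),(36,7),(17,6),(5,1),
        (4,1),(3,1),(8,2),(3,4),(13,1)]       # (y_i in yards, s_i)
N_c, ell = 25, 25.0                           # clusters sighted, miles travelled

theta_hat = (N_c - 1) / ell                   # bias-corrected MLE (4.12)
var_theta_hat = theta_hat**2 / (N_c - 2)      # plug-in version of (4.25)
y_bar = sum(y for y, s in data) / N_c
s_bar = sum(s for y, s in data) / N_c
print(theta_hat, var_theta_hat)               # 0.96 and about 0.0401
print(y_bar, s_bar)                           # 9.72 and 2.24
```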
In order to estimate $\rho$, the reparameterized version of the likelihood function given in (4.22) will have to be maximized. The Fortran subroutine ZXMIN, found in IMSL (1979), may be used for this purpose. This program uses a quasi-Newton iterative procedure to find the minimum of a function. Thus, we first need to take the negative of the likelihood equation before we can use this subroutine to our advantage. On output, this subroutine not only provides the values at which the function is minimized, but also provides numerical estimates for the second partial derivatives of the function evaluated at the minimization point. Thus, when used with the negative of the likelihood function, this program will provide the maximum likelihood estimates, $\hat\rho$ and $\hat a$, as well as the matrix of negative second partial derivatives of the likelihood, $L(\cdot)$, evaluated at $\hat\rho$ and $\hat a$. We will denote this matrix by

$$V = -\begin{pmatrix} \dfrac{\partial^2 \ln L(\cdot)}{\partial a^2} & \dfrac{\partial^2 \ln L(\cdot)}{\partial a\, \partial \rho} \\[2ex] \dfrac{\partial^2 \ln L(\cdot)}{\partial \rho\, \partial a} & \dfrac{\partial^2 \ln L(\cdot)}{\partial \rho^2} \end{pmatrix}_{a = \hat a,\; \rho = \hat\rho}.$$

For our data, the use of the subroutine ZXMIN with initial values $a_1 = 2.24$ and $\rho_1 = .16$ yielded $\hat a = 2.844$, $\hat\rho = .0907$, and
$$V = \begin{pmatrix} 7.687 & -161.229 \\ -161.229 & 5098.985 \end{pmatrix}.$$

The initial value used for $a$ was the mean of the observed cluster sizes, i.e.,

$$a_1 = \bar s = \frac{\sum_{i=1}^{25} s_i}{25} = 2.24.$$

Since our model does not assume that all animals belonging to a cluster are seen, $\bar s$ would underestimate the expected cluster size, i.e.,

$$\bar s < E(R) = \frac{a}{1 - e^{-a}}.$$

Thus, $\bar s$ seems to be a good starting value for $a$.

In choosing an initial value for $\rho$, first recall that

$$\rho = \frac{\lambda^* a}{\alpha(a)},$$

where $\alpha(a)$ is given in equation (4.21). Since our initial value for $a$ is $\bar s$, all we need is a starting value for $\lambda^*$. If every animal in a cluster were seen with probability 1, the density of clusters would be estimated by the method described in Chapter II. In that case, the maximum likelihood estimate for $\lambda^*$ would be $1/\bar y$, where

$$\bar y = \frac{\sum_{i=1}^{25} y_i}{25}.$$
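The maximization that ZXMIN performed can be sketched with a modern optimizer. The Python fragment below is not the original Fortran program: it rebuilds the reparameterized negative log-likelihood from (4.22) (dropping the $\theta^*$ factor, which separates out, and the constant $\sum \ln s_i!$) and hands it to SciPy's derivative-free Nelder-Mead routine as a stand-in for IMSL's quasi-Newton subroutine.

```python
import numpy as np
from math import factorial
from scipy.optimize import minimize

# Right angle distances (yards) and cluster sizes from the data of Section 4.5.
y = np.array([1,3,7,10,2,5,4,7,15,22,6,3,2,12,28,9,18,36,17,5,4,3,8,3,13], float)
s = np.array([1,2,1,1,3,5,1,2,1,1,1,6,1,1,3,2,1,7,6,1,1,1,2,4,1], float)

def alpha(a, j_max=80):
    # alpha(a) of (4.21)
    return sum((-1)**(j + 1) * a**j / (j * factorial(j)) for j in range(1, j_max + 1))

def nll(params):
    """Negative log-likelihood in the (rho, a) parameterization; lam* is
    recovered as rho * alpha(a) / a.  The Gamma factor in theta* and the
    constant sum(log s_i!) are omitted."""
    rho, a = params
    if rho <= 0 or a <= 0:
        return np.inf
    lam = rho * alpha(a) / a
    u = a * np.exp(-lam * y)                  # a e^{-lam* y_i}
    ll = (np.sum(s) * np.log(a) - lam * np.sum(y * s) - np.sum(u)
          + len(y) * (np.log(lam) - np.log(1 - np.exp(-a)) - np.log(alpha(a)))
          + np.sum(np.log(1 - np.exp(-u))))
    return -ll

start = np.array([0.16, 2.24])                # (rho_1, a_1) as in the text
res = minimize(nll, start, method="Nelder-Mead")
print(res.x)    # fitted (rho_hat, a_hat); the text reports .0907 and 2.844
```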
Thus, as the initial value for $\rho$ we used

$$\rho_1 = \frac{\bar s}{\bar y\, \alpha(\bar s)}.$$

The estimate for the density can now be calculated. Using (4.23) and substituting the values we obtained for $\hat\theta^*$ and $\hat\rho$, we get

$$\hat D^* = 76.7 \text{ animals/square mile}.$$

Now, if we can obtain a large sample approximation for the variance of $\hat\rho$, then we can use (4.24) as an approximation for the variance of $D^*$. Under the usual regularity conditions, $V$ will be a large sample approximation to the inverse of the variance-covariance matrix of $\hat a$ and $\hat\rho$. Furthermore, the approximate variance of $D^*$ can be obtained from equation (4.24) after substituting the element of $V^{-1}$ corresponding to the approximate variance of $\hat\rho$, along with the other appropriate quantities. Straightforward calculations show that

$$\sqrt{\mathrm{Var}(D^*)} \approx 26.2 \text{ animals/square mile}.$$

The use of this Fortran subroutine required a minimal amount of programming to enter the appropriate likelihood function. It was run using the computer facilities of the Northeast Regional Data Center located in Gainesville, Florida. Less than two seconds of CPU time was needed for the estimates to converge to values that agreed to four significant digits on two successive iterations.
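The "straightforward calculations" can be reproduced from the reported quantities. Note one assumption on our part: since $y$ is in yards and $\ell$ in miles, $\hat\theta^*\hat\rho/2$ is in animals per mile-yard, and a factor of 1760 yards/mile (not stated explicitly in the text) converts it to animals per square mile.

```python
from math import sqrt

theta_hat = 0.96                       # (4.12): (N_c - 1)/L with N_c = L = 25
var_theta = theta_hat**2 / (25 - 2)    # plug-in version of (4.25)
rho_hat = 0.0907                       # MLE reported from the ZXMIN run
V = [[7.687, -161.229], [-161.229, 5098.985]]

# V^{-1} approximates the variance-covariance matrix of (a_hat, rho_hat);
# the (rho, rho) element of the 2x2 inverse estimates Var(rho_hat).
det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
var_rho = V[0][0] / det

# D* = theta_hat * rho_hat / 2, converted by 1760 yd/mile (our reading of
# the units) to animals per square mile.
D = theta_hat * rho_hat / 2 * 1760

# Goodman (1960) product-variance formula (4.24), under the same conversion.
var_D = (1760**2 / 4) * (theta_hat**2 * var_rho + rho_hat**2 * var_theta
                         + var_theta * var_rho)
print(D, sqrt(var_D))   # about 76.6 and 26.2; the text reports 76.7 and 26.2
```

The small discrepancy in the point estimate (76.6 versus 76.7) is consistent with the text having carried more digits of $\hat\rho$ than the rounded .0907.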
BIBLIOGRAPHY

Anderson, D. R., Laake, J. L., Crain, B. R., and Burnham, K. P. (1976), Guidelines for Line Transect Sampling of Biological Populations, Logan: Utah Cooperative Wildlife Research Unit.

Anderson, D. R., and Pospahala, R. S. (1970), "Correction of Bias in Belt Transect Studies of Immotile Objects," Journal of Wildlife Management, 34, 141-146.

Barr, A. J., Goodnight, J. H., Sall, J. P., and Helwig, J. T. (1976), A User's Guide to SAS 76, Raleigh: SAS Institute.

Bhat, U. N. (1972), Elements of Applied Stochastic Processes, New York: John Wiley & Sons.

Burnham, K. P., and Anderson, D. R. (1976), "Mathematical Models for Nonparametric Inferences from Line Transect Data," Biometrics, 32, 325-336.

Crain, B. R., Burnham, K. P., Anderson, D. R., and Laake, J. L. (1978), A Fourier Series Estimator of Population Density for Line Transect Sampling, Logan: Utah State University Press.

Eberhardt, L. L. (1968), "A Preliminary Appraisal of Line Transects," Journal of Wildlife Management, 32, 82-88.

Gates, C. E., Marshall, W. H., and Olson, D. P. (1968), "Line Transect Method of Estimating Grouse Population Densities," Biometrics, 24, 135-145.

Goodman, L. A. (1960), "On the Exact Variance of Products," Journal of the American Statistical Association, 55, 708-713.

Hayne, D. W. (1949), "An Examination of the Strip Census Method for Estimating Animal Populations," Journal of Wildlife Management, 13, 145-157.

IMSL (1979), The IMSL Library, Seventh ed., Vol. 3, Houston: International Mathematical and Statistical Libraries, Inc.

Korn, G. A., and Korn, T. M. (1968), Mathematical Handbook for Scientists and Engineers, Second ed., New York: McGraw-Hill.
Leopold, A. (1933), Game Management, New York: Charles Scribner's Sons.

Lindgren, B. W. (1968), Statistical Theory, Second ed., New York: Macmillan.

Loftsgaarden, D. O., and Quesenberry, C. P. (1965), "A Nonparametric Estimate of a Multivariate Density Function," Annals of Mathematical Statistics, 36, 1049-1051.

Pielou, E. C. (1969), An Introduction to Mathematical Ecology, New York: John Wiley & Sons.

Pollock, K. H. (1978), "A Family of Density Estimators for Line Transect Sampling," Biometrics, 34, 475-478.

Robinette, W. L., Jones, D. A., Gashwiler, J. S., and Aldous, C. M. (1954), "Methods for Censusing Winter-Lost Deer," North American Wildlife Conference Transactions, 19, 511-524.

Robinette, W. L., Loveless, C. M., and Jones, D. A. (1974), "Field Tests of Strip Census Methods," Journal of Wildlife Management, 38, 81-96.

Seber, G. A. F. (1973), The Estimation of Animal Abundance and Related Parameters, London: Griffin.

Sen, A. R., Tourigny, J., and Smith, G. E. J. (1974), "On the Line Transect Sampling Method," Biometrics, 30, 329-340.

Smith, M. H., Gardner, R. H., Gentry, J. B., Kaufman, D. W., and O'Farrell, M. H. (1975), Small Mammals: Their Productivity and Population Dynamics, International Biological Program.

Webb, W. L. (1942), "Notes on a Method of Censusing Snowshoe Hare Populations," Journal of Wildlife Management, 6, 67-69.
BIOGRAPHICAL SKETCH

John Anthony Ondrasik was born on August 17, 1951, in New Brunswick, New Jersey. Shortly thereafter his parents moved to Palmerton, Pennsylvania, where he grew up and attended high school. After graduation in June, 1969, he entered Bucknell University in Lewisburg, Pennsylvania, and received the degree of Bachelor of Science with a major in mathematics in June, 1973. It was during his studies at Bucknell that he became interested in statistics through the influence of the late Professor Paul Benson.

In September, 1973, he matriculated in the graduate school at the University of Florida and received the degree Master of Statistics in 1975. While pursuing his graduate studies, he worked for the Department of Statistics as an assistant in their biostatistics consulting unit. In November, 1978, he accepted the position of biostatistician with Boehringer Ingelheim, Ltd.

John Ondrasik is married to the former Anntoinette M. Lucia. Currently they reside in Danbury, Connecticut.
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Pejaver V. Rao, Chairman
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Dennis D. Wackerly
Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Richard L. Scheaffer
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Ramon C. Littell
Associate Professor of Statistics
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Wayne R. Marion
Assistant Professor of Forest Resources and Conservation

This dissertation was submitted to the Graduate Faculty of the Department of Statistics in the College of Liberal Arts and Sciences and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December 1979

Dean, Graduate School