Many wildlife studies seek to understand changes or differences in the proportion of sites occupied by a species of interest. These studies are hampered by imperfect detection of these species, which can result in some sites appearing to be unoccupied that are actually occupied. Occupancy models solve this problem and produce unbiased estimates of occupancy and related parameters. Required data (detection/non-detection information) are relatively simple and inexpensive to collect. Software is available free of charge to aid investigators in occupancy estimation.
Studies of wildlife populations often attempt to understand patterns of distribution and abundance. Estimating abundance can be a costly endeavor, and other state variables like species richness or occupancy may be more appropriate and less expensive. Occupancy is an alternative that has a long history of use in ecological and wildlife studies. Two of the most prominent uses of occupancy information are (1) studies of species distribution and range, where investigators seek to understand the factors that determine whether or not a species will exist at a location (e.g., habitat modeling, Scott et al. 2002), and (2) metapopulation dynamics (Hanski 1992), where site (or patch) occupancy is related to patch, or site-specific, characteristics. For the latter case, extinction and colonization probabilities can also be modeled in relation to patch characteristics. Monitoring occupancy can reveal changes in the status of a species over broad areas and may be appropriate for species that exhibit wide population fluctuations over short time periods. For example, occupancy has been the most influential state variable in describing world-wide amphibian declines (Green 1997).
Wildlife species are rarely detected with perfect accuracy, regardless of the technique employed. Non-detection does not necessarily mean that a species was absent unless the probability of detecting the species (detectability) was 100%. This leads to a fundamental problem: the measure of occupancy (presence/absence at a set of sites) is confounded with the detectability of the species. More specifically, an observed “absence” occurs if either the species was present at the site but not detected, or the species was truly absent. Detectability may vary among study sites and may be related to characteristics of a survey on a particular day, such as weather conditions. Because of this variation in detectability, it is insufficient to simply analyze detection/non-detection data as if they are truly presence/absence data. The proportion of sites where a species is detected will always underestimate the true occupancy level in the study area when detection is imperfect. As a result, the influence of site characteristics on occupancy will be difficult or impossible to discern reliably (e.g., Gu and Swihart 2004).
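The size of this underestimate can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that every occupied site has the same per-survey detection probability `p` and is surveyed `n_surveys` times; the function name and the example values are hypothetical and do not come from any study cited here. Under those assumptions the expected proportion of sites with at least one detection is the true occupancy multiplied by 1 - (1 - p)^n_surveys, which is below the true occupancy whenever p is less than 1.

```python
def expected_naive_occupancy(psi, p, n_surveys):
    """Expected proportion of sites with at least one detection.

    psi       -- true probability that a site is occupied (assumed value)
    p         -- per-survey detection probability at an occupied site
                 (assumed constant across sites and surveys)
    n_surveys -- number of repeat surveys at each site
    """
    # Probability that an occupied site yields at least one detection
    detected_at_least_once = 1.0 - (1.0 - p) ** n_surveys
    return psi * detected_at_least_once


if __name__ == "__main__":
    # Hypothetical values: true occupancy 0.6, detection probability 0.3
    for n_surveys in (1, 2, 4, 8):
        naive = expected_naive_occupancy(0.6, 0.3, n_surveys)
        print(n_surveys, round(naive, 3))
```

More surveys shrink the gap between the naive proportion and the true occupancy, but the gap closes completely only when detection is perfect.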
New classes of models, called occupancy models, were developed to solve the problems created by imperfect detectability (MacKenzie et al. 2002, 2003, 2004). These models use information from repeated observations at each site to estimate detectability. Detectability may vary with site characteristics (e.g., habitat variables) or survey characteristics (e.g., weather conditions), whereas occupancy relates only to site characteristics. Repeated observations can take many forms, but the most obvious is simply surveying each site repeatedly. In some cases, traps, coverboards, transects, and surveys by independent observers can be treated as repeated observations for a local sample area or site. For example, data from 10 minnow traps in each of 30 ponds could be treated as 10 observations at each pond if there is some possibility that the species of interest could be caught in each of the traps.
How Does This Work?
The technique is very similar to estimating abundance from mark-recapture data but does not require any marking of animals. Necessary information for occupancy models is simply a record of whether a species was detected or not detected during each survey of each site (Box 1). These records, termed detection histories, can be converted to mathematical statements. Assuming the sites are independent, the product of all the mathematical statements (one for each site's detection history) forms the model likelihood for the observed data, and maximum likelihood techniques are then used to estimate model parameters. Parameter estimates (occupancy or detectability) can be related to various site and survey characteristics using the logistic equation or logit-link function. The details of this and other variations on occupancy estimation are described in a series of journal articles (see MacKenzie et al. papers under Further Reading).
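The way detection histories become a likelihood can be sketched in a few lines of Python. This is a minimal illustration rather than the algorithm used by any particular program. It assumes the simplest model, with one occupancy probability `psi` and one detection probability `p` shared by all sites and surveys. The detection histories are invented for the example, and the grid search is a deliberately crude stand-in for a proper numerical optimizer.

```python
import math


def history_probability(history, psi, p):
    """Probability of one site's detection history (1 = detected, 0 = not)."""
    detections = sum(history)
    misses = len(history) - detections
    # Probability of this exact sequence if the site is occupied
    prob_if_occupied = (p ** detections) * ((1.0 - p) ** misses)
    if detections > 0:
        # At least one detection, so the site must be occupied
        return psi * prob_if_occupied
    # Never detected: occupied but missed every time, or truly unoccupied
    return psi * prob_if_occupied + (1.0 - psi)


def log_likelihood(histories, psi, p):
    """Sum of log probabilities across independent sites."""
    return sum(math.log(history_probability(h, psi, p)) for h in histories)


def fit_by_grid(histories, steps=100):
    """Crude maximum likelihood: try every (psi, p) pair on a regular grid."""
    best_ll, best_psi, best_p = -math.inf, None, None
    for i in range(1, steps):
        psi = i / steps
        for j in range(1, steps):
            p = j / steps
            ll = log_likelihood(histories, psi, p)
            if ll > best_ll:
                best_ll, best_psi, best_p = ll, psi, p
    return best_psi, best_p


if __name__ == "__main__":
    # Hypothetical data: six sites, three surveys each
    histories = [(1, 0, 1), (0, 1, 0), (0, 0, 0),
                 (0, 0, 0), (1, 1, 0), (0, 0, 0)]
    psi_hat, p_hat = fit_by_grid(histories)
    naive = sum(1 for h in histories if any(h)) / len(histories)
    print("naive:", naive, "psi_hat:", psi_hat, "p_hat:", p_hat)
```

In this toy data set the species is detected at half of the sites, and the fitted occupancy is higher than that naive proportion because the model allows for occupied sites where the species was missed on every survey.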
Two software packages will help you do an occupancy analysis. PRESENCE was created exclusively for occupancy analysis and is available at http://www.mbr-pwrc.usgs.gov/software.html. Occupancy analysis has also been incorporated into MARK, which is available at http://www.cnr.colostate.edu/~gwhite/software.html. Examples using occupancy models include Corn et al. (2005), Olson et al. (2005), and O’Connell et al. (2005).
All models have assumptions, and occupancy models are no exception. Critical assumptions for data collected during a single sampling season include:
1. Occupancy state is “closed.” Species are present at occupied sites for the duration of the sampling season. Occupancy does not change at a site within the sampling season, but it can change between sampling seasons.
2. Sites are independent. Detection of the target species at one site is independent of detecting the species at other sites. This might be a problem if your sites are closely spaced, allowing animals to move among sites and be detected at multiple sites.
3. No unexplained heterogeneity in occupancy. Probability of occupancy is the same across sites, or differences in occupancy can be explained with site characteristics (covariates) that have been quantified for inclusion in the model.
4. No unexplained heterogeneity in detectability. Detectability at occupied sites is the same across all surveys and sites, or differences in detectability can be explained with site or survey characteristics that have been quantified for inclusion in the model.
Investigators can use design-based or model-based approaches to meet these assumptions. A design-based approach involves collecting data in a way that assures the assumptions will be met. For example, natural history information may aid in scheduling surveys during times when the closure assumption is most likely to hold. In addition, a scientist may use existing movement information to ensure that sample sites are dispersed across the sample area in a manner that maintains independence. Sites should be chosen according to some type of probability-based sampling (e.g., simple-random sample, stratified-random sample, etc.) to ensure that estimates of occupancy apply to the area of interest. Sampling may be standardized to try to minimize differences in detection probability caused by variation in environmental conditions or time of day. Unfortunately, it is impossible to control for all possible factors that can affect detectability or occupancy. When design-based approaches cannot reduce all of the variation (sometimes termed heterogeneity) in either occupancy or detectability, model-based approaches might help. Investigators should collect information about factors they believe could cause heterogeneity in either of these two parameters and then incorporate these covariates into the estimation process. For example, it may be impractical and wasteful to sample amphibian breeding ponds only on sunny days, but larvae are more difficult to see on cloudy days; thus, investigators should record weather conditions during each survey. The variation in detection probabilities caused by cloud cover can easily be incorporated to obtain unbiased estimates of occupancy.
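As a rough illustration of the model-based approach, the sketch below shows one way a survey covariate such as cloud cover might enter the detection probability through a logit link. The function names and the coefficient values (`intercept`, `cloud_effect`) are hypothetical. In a real analysis the coefficients would be estimated from the data by software such as PRESENCE or MARK, not chosen by hand.

```python
import math


def inverse_logit(x):
    """Map any real number onto a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def detection_probability(cloudy, intercept, cloud_effect):
    """Per-survey detection probability modeled on the logit scale.

    cloudy       -- 1 if the survey took place on a cloudy day, 0 if sunny
    intercept    -- hypothetical logit-scale detection on sunny days
    cloud_effect -- hypothetical logit-scale change on cloudy days
    """
    return inverse_logit(intercept + cloud_effect * cloudy)


def history_probability_with_weather(history, cloudy_flags, psi,
                                     intercept, cloud_effect):
    """Probability of one site's detection history with survey-specific p."""
    prob_if_occupied = 1.0
    for detected, cloudy in zip(history, cloudy_flags):
        p = detection_probability(cloudy, intercept, cloud_effect)
        prob_if_occupied *= p if detected else (1.0 - p)
    if any(history):
        # At least one detection, so the site must be occupied
        return psi * prob_if_occupied
    # Never detected: occupied but missed every time, or truly unoccupied
    return psi * prob_if_occupied + (1.0 - psi)


if __name__ == "__main__":
    # Hypothetical coefficients: detection is lower on cloudy days
    print("sunny p: ", detection_probability(0, 0.0, -1.0))
    print("cloudy p:", round(detection_probability(1, 0.0, -1.0), 3))
```

Because a missed detection on a cloudy day is given a lower detection probability than one on a sunny day, it counts as weaker evidence of absence, which is how recorded weather conditions help keep the occupancy estimate from being pulled downward.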
If the assumptions mentioned above are not met, estimates of occupancy and detectability can be biased and inferences about factors that influence these parameters may be flawed (e.g., Gu and Swihart 2004). If the target species is not present at sites throughout the entire study season, then estimates of occupancy may still be unbiased if the species moves randomly in and out of a sampling unit. In this case, however, the interpretation of occupancy changes to the proportion of sites used by the target species. Likewise, the probability of detecting the species at occupied sites is now a combination of two different components: the probability that the species was present at the sampling unit and the probability of detecting the species given it was present. If movement in and out of the sampling unit is not random, the occupancy estimator will likely be biased. For example, non-random movement occurs if the target species is not at the site when sampling commences, then moves into the sample unit and stays for the duration of the season. The direction of the occupancy bias depends on the direction of the movement (see Kendall 1999 for more details and possible solutions).
Little work has been done on the impact of variation in occupancy probability among sites that cannot be associated with covariates (occupancy heterogeneity). The overall average occupancy estimate may still be unbiased, but more research on the impacts of occupancy heterogeneity is needed.
Heterogeneity in detection probability will often result in occupancy estimates that are low (negatively biased). This problem is exacerbated in studies involving a small number of sites, few repeated surveys at each site, or species with exceptionally low detection probabilities. Anticipating variation and minimizing its effects, either through study design or by collecting relevant covariates to model it, is essential for good performance of these methods.
Finally, if detection is not independent among sites, the standard error estimates are usually too low. In these instances, the number of independent sites is actually smaller than the total number of sites surveyed. There are existing model-based methods that aid in detecting and correcting this problem (MacKenzie and Bailey 2004).
Approximately 10% of the world’s salamander species are found in the southern Appalachian region, with 31 species occurring inside the boundaries of Great Smoky Mountains National Park (GSMNP; Dodd 2003). Despite this rich diversity, large-scale or long-term studies of terrestrial salamanders in this region are almost nonexistent (but see Hairston and Wiley 1993). An area of interest involves the impact of various forms of disturbance on terrestrial salamander populations (Petranka et al. 1993, Ash 1997, Petranka 1999, Ash and Pollock 1999). Some areas within GSMNP incurred heavy human use in the form of logging or settlement prior to the park’s establishment in 1934. In this example analysis, we ask if previous disturbance history affects the probability of occurrence for salamanders of the Desmognathus imitator complex in GSMNP. We caution that this analysis is meant as an example only, and we remind readers that there is always an inherent danger in inferring biological process from spatial patterns.
Data: Salamanders were sampled using two methods: an area-constrained natural-cover transect (50 x 3 m) and a 50 m coverboard transect, consisting of 5 coverboard stations spaced at 10 m intervals. The two transects were parallel to one another and separated by approximately 10 m. Together, the area sampled by these transects constituted a site or sample unit. Sample units were near trails and located approximately 250 m apart to ensure independence among sites. Thirty-nine sites were sampled once every two weeks from April to mid-June when salamanders were believed to be most active and near the surface. However, we detected no salamanders of the Desmognathus imitator complex during the first survey. Thus, we eliminated this survey from the analysis because we assumed the salamanders had not emerged from their winter retreats and were unavailable for capture during this survey occasion. This left a total of four surveys for the analysis.
Analysis, Model Selection, and Interpretation: Salamanders of the Desmognathus imitator complex were detected at 10 of the 39 sites, yielding a naïve occupancy estimate of 0.26; however, we suspected that salamanders may be more likely to occupy undisturbed sites compared to disturbed sites. In addition, we thought detectability might vary among surveys due to environmental conditions such as rainfall or temperature. Thus, we considered all combinations of models in which occupancy probability was assumed to be constant for all sites (denoted as ψ(.)) or varied among sites according to the site’s previous disturbance history (ψ(dist)); detection probability was either constant (p(.)), different among surveys (p(t)), or varied among sites according to previous disturbance history (p(dist)). Models that fit the data best with the fewest parameters are favored. We do not have room to explain the details of this parsimony-based process of model selection, but Burnham and Anderson (2002) have written a comprehensive book on the subject. Using the software PRESENCE and methods they describe, our analysis highlighted four models as the best models representing our salamander data (see Box 2). There is some uncertainty as to which model is the best, so our parameter estimates are essentially a weighted average among all four models. Together these models suggest that occupancy indeed differs between previously disturbed and undisturbed sites; all of the top models include previous disturbance history as a covariate in the occupancy estimate. Model-averaged occupancy estimates were 0.19 (SE = 0.15) and 0.70 (SE = 0.15) for previously disturbed and undisturbed sites, respectively. Detectability also varied among surveys and possibly among sites with different disturbance histories (Figure 1, Box 2).
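The weighted averaging mentioned above can be sketched as follows. This example assumes the weights are Akaike weights computed from AIC values, which is a common choice within the framework Burnham and Anderson (2002) describe; the text does not state which criterion was used, so treat that as an assumption. The AIC values and per-model estimates below are invented for illustration and are not the values from Box 2.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1 (lower AIC = more weight)."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


def model_average(estimates, weights):
    """Weighted average of one parameter's estimates across models."""
    return sum(est * w for est, w in zip(estimates, weights))


if __name__ == "__main__":
    # Naive occupancy from the counts reported in the text: 10 of 39 sites
    print("naive occupancy:", round(10 / 39, 2))

    # Hypothetical AIC values and occupancy estimates for four models
    aic_values = [100.0, 101.2, 102.0, 103.5]
    occupancy_estimates = [0.68, 0.72, 0.70, 0.74]
    weights = akaike_weights(aic_values)
    print("weights:", [round(w, 3) for w in weights])
    print("model-averaged occupancy:",
          round(model_average(occupancy_estimates, weights), 3))
```

Models with AIC values close to the best model keep a meaningful share of the weight, so the averaged estimate reflects uncertainty about which model is best instead of relying on a single model.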