
regression line (Y = 13.048 + 0.0059X) and is significant (r = 0.89, p < 0.01). The year the
sample was collected is indicated. Data from Table 4.2.

than those between NISP and NTAXA. The NISP of deer varies by as much as thirteen
NISP per cubic meter across the six samples from the Meier site (Table 4.1). There
is no statistically significant relationship between the volume excavated per year and
the NISP per annual sample for the 1987–1991 samples (r = 0.05, p > 0.2); the smallest
(1973) sample is deleted because it influences the result considerably (r = 0.73 if
that sample is included). NISP per annual sample and richness per annual sample at
the Meier site, on the other hand, are strongly correlated (Figure 4.4). This suggests
that the better variable with which to monitor the influence of sample size on variables
such as NTAXA is NISP, though this may vary.
Were I to examine the adequacy of the sample from Cathlapotle in a real analysis
rather than simply illustrating an analytical technique, I would apply the sampling
to redundancy protocol to the precontact assemblage and also to the postcontact
assemblage rather than to the site collection as a whole. This protocol demands that
the target variable of interest be explicitly defined and the boundaries of the appropriate
sample be unambiguous. Nevertheless, several lessons can be taken from the
preceding. First, the absolute size of a sample is not necessarily a good measure of
that sample's representativeness of a particular target variable. The total mammalian
sampling, recovery, and sample size 151

genera NISP from Cathlapotle (= 6,937) is larger than the total mammalian genera
NISP from Meier (= 6,421), yet the latter seems to be representative of NTAXA
whereas the former does not. Second, cumulative
chronological samples, whether by week, month, or year, provide logical units
with an inherent cumulative order that may provide an indication of when enough
material has been collected. Such an argument presumes that identification of the
recovered faunal remains proceeds apace with recovery, or that the time lag between
the two is insignificant. If identification can keep pace with recovery, then paleobiological
resources can be saved in situ rather than disturbed (some would say
“destroyed”) by recovery because it will be clear when a sample sufficiently large to
provide an accurate answer (one not influenced by inadequate sample size) has been
collected.
The final lesson is that, presuming one knows the identity of the variables plotted
on both axes (and there is no reason the analyst should not), the meaning of the
cumulative curve is commonsensical. When the curve is steep, much new information
is being added with each new sample; when it is horizontal, new samples are adding no
new information about the target variable. This makes such a curve useful as a simple
(and readily visible) way to monitor what is being learned as sample size increases, and
to determine whether additional samples are necessary or not. Samples can comprise
the material collected during one temporal period, or they can be structured some
other way, such as choosing (using probability sampling) a sample of 10 percent of all
units excavated or exposures inspected (or collected), then another 10 percent, then
another, and so on. Similarly, any target variable that consists of a single value can
be plotted on the y-axis, just as any kind of sample can be plotted on the x-axis. Such
target variables might involve the average or mean size of individuals of a taxon, or
the frequencies of skeletal parts of a taxon, or virtually any variable.
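The logic of the cumulative curve can be sketched in a few lines of Python. The annual samples below are invented for illustration (they are not the Meier or Cathlapotle data), and the function names are hypothetical; the sketch simply accumulates NISP for the x-axis and NTAXA for the y-axis, then reports how many of the most recent samples added no new taxa.

```python
# Hypothetical annual samples: (label, NISP, set of taxa identified that year).
# These values are invented for illustration; they are not from any site.
samples = [
    ("year 1", 120, {"deer", "elk", "beaver"}),
    ("year 2", 340, {"deer", "elk", "raccoon", "muskrat"}),
    ("year 3", 510, {"deer", "beaver", "otter"}),
    ("year 4", 480, {"deer", "elk", "muskrat"}),
    ("year 5", 600, {"deer", "elk", "beaver", "raccoon"}),
]


def cumulative_curve(samples):
    """Return (label, cumulative NISP, cumulative NTAXA) for each successive sample."""
    seen = set()
    total_nisp = 0
    curve = []
    for label, nisp, taxa in samples:
        total_nisp += nisp
        seen |= taxa  # taxa already tallied do not raise NTAXA again
        curve.append((label, total_nisp, len(seen)))
    return curve


def trailing_flat_samples(curve):
    """Count the most recent samples that added no new taxa (the flat tail)."""
    flat = 0
    for i in range(len(curve) - 1, 0, -1):
        if curve[i][2] == curve[i - 1][2]:
            flat += 1
        else:
            break
    return flat


curve = cumulative_curve(samples)
for label, nisp, ntaxa in curve:
    print(f"{label}: cumulative NISP = {nisp}, cumulative NTAXA = {ntaxa}")
print("recent samples adding no new taxa:", trailing_flat_samples(curve))
```

A flat tail of this kind is only suggestive; how many samples without new information are enough remains a judgment tied to the target variable and the research question.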
There is more to say about the type of graph shown in Figures 4.1–4.4. This
graph type is one in which a measure of sample size is plotted against a measure of a
biological property, and more is said about such graphs and different versions of them
later. The sampling to redundancy type of graph is introduced here to illustrate its
value for evaluating sample adequacy as sample size is being actively increased. Once
the fieldwork is completed, little else can be done to increase the size of a collection.
Knowing whether more specimens are needed as collection is taking place would be
valuable knowledge. Knowing whether one is losing material rather than collecting
it as sediment is inspected (e.g., screened) would be equally valuable knowledge.
Zooarchaeologists in particular have devoted a great deal of energy to figuring out
how to generate this latter sort of knowledge, and it is to the results of those
efforts that we turn next.
quantitative paleozoology


Once a geographic and geological context in which to look for faunal remains has
been chosen, the next step is to choose how those remains will be searched for
and retrieved from sediments. Faunal remains can be hand picked from sediments as
those sediments are excavated. Bones and teeth can be collected from screens or sieves
the function of which is to allow sediment to pass through whereas faunal remains
are caught in the mesh where they are more visible than when in the sediment in
the excavation. Screens were not always used by zooarchaeologists who gathered, by
hand, those bones and teeth they saw in the sediment as it was excavated. Watson
(1972) showed that many bone fragments ≤ 3 cm in maximum dimension tended to be
overlooked when hand picking alone was used. Passing sediments through screens
increased the return of small fragments by an order of magnitude. An earlier study
showed the same thing using remains of mollusks.

Hand Picking Specimens by Eye

Sparks (1961) demonstrated that the percentage of recovered remains of terrestrial
mollusks differed markedly by size class. The sample collected by eye, unaided by
screens, tended to have more specimens representing large size classes (>50 percent
of all specimens recovered) whereas the sample collected from a screen was dominated
by specimens representing small size classes (>70 percent of all specimens
recovered). His data are graphed in Figure 4.5 in a way different from that used by
Sparks, to allow comparison of the taxonomic abundances in the two samples. The identity of
the taxa themselves is unimportant to this exercise, so categories of specimens are
distinguished on the basis of ordinal-scale average size. Thus, whereas Sparks (1961)
distinguished eighteen taxa, there are only fifteen size classes here. Figure 4.5 shows
that the taxa with the largest shells were those that were most often collected by hand,
and those taxa with the smallest shells were seldom collected by hand. What seems
to have been a 2 mm mesh sieve produced many more individuals of small size, and
Sparks (1961:72) concluded in an understated way that “Any attempt to pick out shells
by eye from a deposit is bound to lead to distortion in the percentage frequencies of
species.” Study by invertebrate paleobiologists of what is now referred to as size bias
continues (e.g., Cooper et al. 2006; Kowalewski and Hoffmeister 2003).
Results like those Sparks (1961) derived for mollusks were found by Payne (1972,
1975) for mammal remains. Although he did not explicitly list the body size or average
live weight of an adult animal, Payne (1975) found that more of the larger remains

Figure 4.5. Relative abundances of fifteen size classes of mollusk shells recovered during
hand picking from the excavation, and recovered from fine-mesh sieves. Original data from
Sparks (1961).

of large-bodied taxa were found by hand while excavating whereas more of the small
remains of small-bodied taxa were found in sieves or screens. Taxa were rank ordered
in five size classes, from largest to smallest: cattle (Bos sp.), pig (Sus sp.), sheep and goat
(Ovis sp., Capra sp.), canid (Canis sp., Vulpes sp.), and hares (Lepus sp.). Assuming
that the remains that were recovered by hand picking from the excavation would
also have been recovered from the screen, Figure 4.6 shows two things about Payne's (1975)


Figure 4.6. The effect of passing sediment through screens or sieves on recovery of mammal
remains relative to hand picking specimens from an excavation unit. Numbers within
bars are NISP. Data from Payne (1975).

data. First, more remains in general are collected from screens than by hand from an
excavation, something not so obvious given how Sparks (1961) presented his data.
Second, the smaller the body size of a taxon, the more of its remains will be found
in the screen than in the excavation; this echoed Sparks's original observation on
mollusk remains but expanded it to include remains of mammals.

Screen Mesh Size

It is commonsensical to believe that small bones and small fragments thereof will fall
through coarse-mesh hardware cloth (that with large holes) whereas many will be
caught and thus recovered if fine-mesh hardware cloth is used. Thomas (1969)
and Payne (1972, 1975) demonstrated this empirically (see also Casteel 1972; Clason
and Prummel 1977), and showed that the magnitude of loss when coarse mesh was
used had been underestimated. Over the next 30 years their seminal work spawned
a plethora of studies on the influence of screen-mesh size on recovery (see James
[1997] for a relatively complete listing of references as of a decade ago). Such studies
continue to this day (e.g., Nagaoka 2005b; Partlow 2006), sometimes with much
more statistical sophistication than that found in the original studies (e.g., Cannon
1999). Although the lessons learned have been significant ones, many of them were
learned with Thomas's (1969) seminal effort. For that reason, various analysts have
subsequently used his data to substantiate arguments concerning the influence of
screen-mesh size on recovery (e.g., Casteel 1972; Grayson 1984).
Thomas (1969) used zooarchaeological data from three sites; this demonstrated
that recovery was not simply a function of the particular sample (geographic and
geological location) chosen. As each site was excavated, sediment was passed through
a series of nested screens with increasingly finer mesh. The first screen was 1/4-inch
(6.4 mm) mesh, the second was 1/8-inch (3.2 mm) mesh, and the final screen was 1/16-inch
(1.6 mm) mesh. All faunal remains in each screen were retrieved and recorded as
to screen mesh in which they were found. After all remains were identified, Thomas
categorized the remains as to average adult live weight of an individual of the taxon
represented. He distinguished five size classes: Class I: live weight < 100 g (e.g., mice);
Class II: live weight 100 to 700 g (e.g., squirrels); Class III: live weight 700 g to 5 kg (e.g.,
rabbits); Class IV: live weight 5 to 25 kg (mid-size mammals); and Class V: live weight
> 25 kg (e.g., deer). Thomas retained distinctions between site-specific samples, and
also those between each vertical analytical level within each site. Such distinctions
are irrelevant to studies of the loss of faunal remains, so we can ignore them and
lump all data into categories defined by screen-mesh size and body size (Table 4.4).

Table 4.4. Mammalian NISP per screen-mesh size class and body-size class for
three sites. Percentages are calculated for each body-size class. Data from
Thomas (1969).

Body-size class    1/4 inch (%)    1/8 inch (%)    1/16 inch (%)    Total
I (< 100 g)        141 (5)         910 (31)        1,930 (64)       2,981
II (100–700 g)     626 (14)        1,478 (33)      2,450 (53)       4,554
III (0.7–5 kg)     1,069 (29)      1,358 (37)      1,275 (34)       3,702
IV (5–25 kg)       85 (96)         4 (4)           0                89
V (> 25 kg)        1,308 (100)     1 (0.1)         0                1,309
Total              3,229           3,751           5,655            12,635

Understating the issue, Grayson (1984:170) noted that there are “a number of ways
in which [Thomas's] recovery data can be analyzed, but no matter how the analysis
proceeds, the effects of screen-mesh size on recovery are dramatic.” Although it
doubtless is untrue, assume, for instance, that 100 percent of all faunal remains were
recovered by the 1/16-inch screen mesh. We can then determine the cumulative
percentage of NISP of each body-size class of mammal that was recovered across
the increasingly finer screen-mesh size classes. These cumulative percentages are all
plotted in Figure 4.7. That figure shows that the larger the body-size class is, the
more of a taxon's remains are recovered in coarse-mesh screens, and the smaller the
body-size class, the more of a taxon's remains are recovered in fine-mesh screens.
Thomas's data empirically demonstrated what had long been suspected prior to his
study (remains of small organisms are lost through coarse-mesh screens), and they
demonstrate it with remarkable clarity. They demonstrate it on at least an ordinal
scale because screen-mesh size classes and body-size classes are treated in Figure 4.7
as ordinal-scale variables. The one thing that we do not know from Thomas's data is
the nature of what is lost through the 1/16-inch mesh screens. But even without such
information, Thomas's data should prompt us to worry about taxonomic abundance
data even if we use a fine-mesh hardware cloth, such as 1/8-inch or 1/16-inch mesh.
Small taxa will be underrepresented relative to large taxa even when fine-mesh sieves
are used. Deciding how thorough to be in recovery efforts (finer mesh will result in
greater thoroughness) is a tactical decision that will depend on the research question
asked and its attendant target variables.
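Under the assumption stated above, that the 1/16-inch screen caught everything, the cumulative percentages plotted in Figure 4.7 can be recomputed directly from Table 4.4. The following Python sketch is one way to do so; the counts are those in Table 4.4, whereas the variable and function names are illustrative only.

```python
# NISP per body-size class from Table 4.4 (Thomas 1969).
# Column order runs from coarsest to finest mesh: 1/4-inch, 1/8-inch, 1/16-inch.
TABLE_4_4 = {
    "I": (141, 910, 1930),
    "II": (626, 1478, 2450),
    "III": (1069, 1358, 1275),
    "IV": (85, 4, 0),
    "V": (1308, 1, 0),
}


def cumulative_percent_recovered(counts):
    """Cumulative percentage of a class's NISP recovered as finer screens are added.

    Assumes (as the text does, for the sake of argument) that nothing passes
    through the finest screen, so the last value is always 100.
    """
    total = sum(counts)
    running = 0
    percentages = []
    for n in counts:
        running += n
        percentages.append(100.0 * running / total)
    return percentages


cumulative = {cls: cumulative_percent_recovered(c) for cls, c in TABLE_4_4.items()}
for cls, (quarter, eighth, sixteenth) in cumulative.items():
    print(
        f"Class {cls}: 1/4-inch = {quarter:.1f}%, "
        f"through 1/8-inch = {eighth:.1f}%, through 1/16-inch = {sixteenth:.1f}%"
    )
```

Read across the classes, the first value in each row rises steadily from class I to class V, which is the ordinal-scale pattern described in the text.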
Even though numerous empirical studies indicate that the coarser the screen mesh,
the more small specimens pass through the sieve and are not recovered, occasionally
this does not seem to hold true (e.g., Vale and Gargett 2002). The potential reasons
for this are several (Gobalet 2005; Zohar and Belmaker 2005), but the most likely ones

Figure 4.7. Cumulative percentage recovery of remains of different size classes (Roman
numerals) of mammals. The critical but empirically unvalidated assumption is that all
remains will be caught in the 1/16-inch mesh screen. Data from Table 4.4 (originally from
Thomas 1969).

are taphonomic (Gargett and Vale 2005). If small remains are taxonomically unidentifiable
because they are anatomically incomplete due to fragmentation, corrosion,
or some other taphonomic process, then it is possible that the use of small sieves
will not increase the value of NTAXA (e.g., Cooper et al. 2006). This is an empirical
matter; every collection is unique and subject to investigation as to whether or not
fine mesh makes a difference.

To Correct or Not to Correct for Differential Loss

If one passes site sediments through coarse-mesh hardware cloth, it is likely that
small bones and small teeth will, like the sedimentary particles themselves, pass
through the screen and thus not be recovered. The coarser the mesh of the hardware
cloth (the larger the openings), the more remains of, first, small animals, and then
progressively larger animals, as coarseness increases, will be lost because they are able
to pass through the hardware cloth. The total magnitude of such loss will depend
on the population of remains of small animals present in the screened sediments
(Clason and Prummel 1977). The choice of sieve mesh size should depend on the

research questions one is asking because using finer mesh means it will take longer
(and cost more) to complete an excavation; there will be more material caught in the
screen that must be looked over and from which faunal remains must be removed.
One way to avoid the total cost of using fine-mesh sieves throughout an excavation
is to take bulk samples every so often (how often is a matter of choice within the
sampling design used) and to pass those bulk samples through one or more finer-meshed
sieves to determine what and how much is being lost. Some analysts have
argued that if the rate of loss can be determined, then what has been recovered
can be mathematically adjusted to account for what has been lost (e.g., James 1997;
Thomas 1969; Ziegler 1965, 1973). Because differential recovery is often a troublesome
concern, it is worthwhile to review one way to correct for differential loss.
Thomas (1969) suggested that the analyst determine a correction factor to analytically
compensate for differential recovery of small remains. This might involve first
using a formula like this:

\[
\text{Percentage of NISP lost} = 100 \times
\frac{\text{NISP from fine-mesh or bulk samples}}
     {\text{NISP from fine-mesh or bulk samples} + \text{NISP from coarse mesh or standard recovery}}
\]

Once the percentage lost is known, the inverse of the fraction recovered (1 minus
the fraction lost) can be multiplied by what has been recovered to estimate what
would have been recovered if there had been no loss. Alternatively, Thomas (1969)
suggested simply calculating the recovery ratio using the formula:

\[
\text{Recovery ratio} =
\frac{\text{NISP for all recovery methods}}{\text{NISP for recovery method of interest}}
\]

This formula is used for each size class of taxa. Thus, using the data in Table 4.4
for illustrative purposes, the recovery ratios per size class are: I: 21.14 (2981/141); II:
7.27 (4554/626); III: 3.46 (3702/1069); IV: 1.05 (89/85); and V: 1.00 (1309/1308). This
means that if one wanted to correct for differential recovery that resulted from use of
different screen mesh sizes at these sites, then the NISP of size class I remains recovered
in the 1/4-inch mesh should be multiplied by 21.14, that of size class II by 7.27,
size class III by 3.46, size class IV by 1.05, and size class V by 1.
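The arithmetic behind these ratios is easy to check. The Python sketch below recomputes the recovery ratios and the percentage lost from the Table 4.4 counts, treating the 1/4-inch screen as the recovery method of interest and the two finer screens together as the "fine-mesh" sample. The function names are hypothetical, and the sketch inherits the assumption that nothing passed through the 1/16-inch screen.

```python
# NISP per body-size class from Table 4.4 (Thomas 1969), ordered by screen mesh:
# (1/4-inch, 1/8-inch, 1/16-inch).
TABLE_4_4 = {
    "I": (141, 910, 1930),
    "II": (626, 1478, 2450),
    "III": (1069, 1358, 1275),
    "IV": (85, 4, 0),
    "V": (1308, 1, 0),
}


def recovery_ratio(counts, method_index=0):
    """NISP for all recovery methods divided by NISP for the method of interest."""
    return sum(counts) / counts[method_index]


def percent_lost(coarse_nisp, fine_nisp):
    """Percentage of the combined NISP that the coarse mesh failed to catch."""
    return 100.0 * fine_nisp / (fine_nisp + coarse_nisp)


ratios = {}
losses = {}
for size_class, counts in TABLE_4_4.items():
    coarse = counts[0]        # caught in the 1/4-inch screen
    fine = sum(counts[1:])    # passed the 1/4-inch screen, caught in finer screens
    ratios[size_class] = recovery_ratio(counts)
    losses[size_class] = percent_lost(coarse, fine)
    # The ratio equals the inverse of the fraction recovered (1 minus the fraction
    # lost), so coarse-mesh NISP times the ratio returns the combined total.
    assert abs(ratios[size_class] - 1.0 / (1.0 - losses[size_class] / 100.0)) < 1e-9
    print(
        f"Class {size_class}: recovery ratio = {ratios[size_class]:.2f}, "
        f"lost through 1/4-inch mesh = {losses[size_class]:.1f}%"
    )
```

The two formulas are thus two expressions of the same quantity: a class that loses most of its remains through the coarse mesh has a correspondingly large recovery ratio.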
There is a critical assumption that must be granted if a correction protocol such
as that described by Thomas is to be used. The assumption is that the rate of loss
determined from the subsample is representative of the entire sample. The weakness

of the assumption is that the recovery rate will likely vary from recovery context
to recovery context because faunal remains tend not to be randomly distributed
throughout a site or throughout a stratum. Loss will not be stable but in fact will likely
vary not only from site to site and from stratum to stratum, but also from horizontal
context to horizontal context within a site or stratum. Few researchers have explored
this potential for a nonhomogeneous distribution of faunal remains with real data.
Thomas (1969) used statistical procedures to determine that there seemed to be
minimal vertical variation in the distributions of faunal remains, and so had an
empirical warrant to apply his correction factor across entire site collections.
Not all sites have homogeneous distributions of faunal remains, and thus it is
ill advised to calculate a correction factor based on one excavation unit (whether
horizontally distinct, vertically distinct, or both) and to then apply that correction
to another unit to obtain, say, a site-wide value (e.g., Cannon 1999; Lyman 1992a;
Shaffer and Baker 1999). Occasionally paleozoologists have noted the proportion of a
deposit that has been excavated, and then estimated frequencies of taxa in the entire
site or deposit (e.g., Lorrain 1968). Again, such an estimation procedure assumes that
the density of NISP per unit of area or unit of volume observed applies to the entire
site or deposit under study. As data presented by Cannon (1999) demonstrate, such
an assumption should be empirically validated, else estimates of total site content
will be in error.


Thus far several issues with respect to generating collections of faunal remains have
been touched on. The focus has been to describe how one might determine if a
collection is representative of a target variable by determining if one has sampled to
redundancy or not, to illustrate how a particular recovery technique might influence
what is collected (hand picking and screen mesh size), and to argue that despite
being able to calculate a recovery rate in a mathematically elegant fashion, to utilize
that rate as a correction factor is unwise given the requisite assumption that faunal
remains are homogeneously distributed over the sampled deposit(s). For the sake of
simplicity, throughout the chapter the focus has been on samples from which one
seeks to measure taxonomic richness, or NTAXA. But the arguments hold with equal
force for taxonomic abundances and other quantitative measures of the taxonomic
composition of a collection, as demonstrated in Chapter 5.
The arguments made here also hold for nontaxonomic quantitative measures. For
example, if the remains of taxa comprised of small individuals are lost more often

than the remains of taxa comprised of large individuals (Shaffer 1992), then it stands
to reason that such intertaxonomic variation in recovery likely also applies intrataxonomically.
In particular, small skeletal elements of a taxon will be lost more often
than large skeletal elements (e.g., Nagaoka 2005b). Similarly, small fragments will be
lost more often than large fragments (Cannon 1999). In general, small specimens will
be lost more often than large specimens, regardless of the taxonomy or anatomical
completeness of those specimens. The general lessons from such observations are two.
The first lesson rests on the fact that a relationship between sample size and the
variable of interest may exist, so paleozoologists should search for such relationships
(e.g., Koch 1987). If a relationship is found, then although the sample might in fact
be representative of the variable of interest, the observed value of that variable might
be a result of sample size (Leonard 1997). Until such possible sample-size effects are
controlled for analytically, or the relationship is found to be merely a correlation and
not causal, it is ill-advised to interpret the variable in terms of some ecological or
anthropogenic factor. The second lesson is that virtually any conceivable quantitative
variable that can correlate with NISP will display values that are also potentially a
function of sample size. Finding correlations between target variables and sample
sizes does not preclude analysis and interpretation, but such findings suggest that
cautious interpretation is warranted if the sample-size effects cannot be analytically
controlled or eliminated. This brings up the important topic of how we might detect
sample-size effects and how we might control for them.
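As a first, minimal illustration of what searching for such a relationship can look like, the Python sketch below computes Pearson's r between the logarithm of NISP and NTAXA across a set of assemblages. The six assemblages are invented for illustration, the function name is hypothetical, and the choice of Pearson's r on log-transformed NISP is an assumption made here for simplicity rather than a prescribed protocol.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical assemblages (invented values): sample size and richness.
nisp = [50, 120, 400, 900, 2500, 6000]
ntaxa = [4, 6, 9, 11, 14, 17]

# Richness is compared with log10(NISP) because richness typically rises
# quickly in small samples and slowly in large ones.
r = pearson_r([math.log10(n) for n in nisp], ntaxa)
print(f"r between log10(NISP) and NTAXA = {r:.3f}")
```

A coefficient near 1, as in these contrived data, would signal that differences in NTAXA among the assemblages may simply track differences in sample size; assessing statistical significance would require an additional test not shown here.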


Botanists recognized in the early twentieth century that the larger the area they sampled
the more species of plant they identified (Leonard 1989). Initially the relationship
was thought to be linear: that as the area sampled increased, the number of species
would increase at a constant rate. Within a decade or two it was empirically demonstrated
that the relationship was semilogarithmic when large areas were considered.
The number of species identified increased as the logarithm of the area increased. By
the late 1930s, the relationship between amount of area sampled and number of plant
species identified was being graphed as shown in Figure 4.8 (after Cain 1938). Within
a few years, a graph of like form was generated for animal taxa but instead of the
area sampled the independent variable was the total number of individual animals
tallied (Fisher et al. 1943). The relationship between area examined and the number
of taxa identified (NTAXA), and that between number of individuals tallied and
NTAXA are the same because the more area examined the more individuals (whether

Figure 4.8. Model of the relationship between area sampled (or sampling intensity) and
number of taxa identified. [Axes: x, area sampled (square meters, hectares, etc.); y, number of taxa.]

plants or animals) are encountered. In ecology, graphs with the form of Figure 4.8
are sometimes referred to as accumulation curves. They are more often referred to as
“species–area curves” because of the seminal discovery of the relationship between
these two variables.
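The semilogarithmic form of the relationship can be made concrete with a small Python sketch. It fits a straight line to number of taxa against the base-10 logarithm of area by ordinary least squares. The areas and taxon counts are contrived so that the fit is exact, and the function name is hypothetical; real species–area data would scatter around the line.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Contrived data (not from any study): each tenfold increase in area
# sampled adds the same number of taxa.
areas = [1, 10, 100, 1000]   # e.g., square meters
taxa = [2, 7, 12, 17]

intercept, slope = fit_line([math.log10(a) for a in areas], taxa)
print(f"NTAXA = {intercept:.1f} + {slope:.1f} * log10(area)")
```

In a relationship of this form the slope is the number of taxa gained per tenfold increase in area, which is why ever larger additions of area are needed to add each new taxon.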
Given the nature of the relationship between the two variables, ecologists in the
middle of the twentieth century became concerned with determination of how much
area to sample, or how many individuals to tally, to ensure that their samples were
representative of the target variable (often a habitat or biological community of
some scale). One solution was to hold the area sampled constant at some minimum
size thought to be adequate. Another is an analytical procedure termed “rarefaction”
(Sanders 1968). Rarefaction involves determination of the number of species
expected if all samples were the same size (if all samples included the same number
of individuals). Richness or NTAXA for a fraction of a collection can be estimated by
drawing a (random) subsample (equal to the fraction) of a sample (equal to the col-
