Figure 5.4. Relationships between NISP and NTAXA of small mammals per stratum

(Roman numerals) at Homestead Cave, Utah (after Grayson 1998). Dashed, best-fit regression

line, r = 0.88, p = 0.3; solid, best-fit regression line, r = 0.92, p = 0.0001. Faunal material

from strata X and XIII–XVI has not been studied. Data from Table 5.2.

measuring the taxonomic structure and composition 183

Table 5.2. NISP and NTAXA for small mammals at Homestead Cave, Utah. Data from Grayson (1998). Faunal remains from strata X and XIII–XVI were not studied

Stratum    NISP      NTAXA
XVIII       1,047     9
XVII       15,421    17
XII        22,661    14
XI          9,996    14
IX         18,043    16
VIII        8,215    13
VII        11,038    15
VI         18,661    17
V           5,093    13
IV         26,200    19
III         2,774    17
II          7,756    20
I           9,906    19

taxa were accumulated and deposited when it was arid. Despite apparent sample size

effects, the mammal assemblages from Homestead Cave appear just as they should

in terms of the relationship between NTAXA and climate.

Another example of studying the covariation of sample size and taxonomic rich-

ness comes from the Upper Paleolithic rockshelter of Le Flageolet I, France (Grayson

and Delpech 1998). The ungulate remains at this site were largely introduced by

humans, but interestingly, there is yet another pair of relationships between NTAXA

(of ungulates) and NISP (of ungulates) (Table 5.3). There is no patterned rela-

tionship between the associated archaeological culture and which line a particular

assemblage of ungulate remains helps define (Figure 5.5). The analysts found no clear

indication that the degree of fragmentation was creating the two relationships, and

no indication that the differential transport of skeletal parts by bone accumulators

had created the two relationships (Grayson and Delpech 1998). They concluded that

the difference involved variation in diet breadth, or the width of the niche exploited

by the humans that created the assemblages.

There are other ways to compare taxonomic richness values of assemblages of

different sizes. Recall from Chapter 4, for example, that the original recognition

of the sample-size effect (the species–area relationship) was based on the amount

quantitative paleozoology

184

Table 5.3. NISP and NTAXA for ungulates at Le Flageolet I, France. Data from Grayson and Delpech (1998). Strata with NISP < 30 are not included

Stratum    NISP     NTAXA
XI           651     6
IX           681    11
VIII         461     9
VII        1,768    10
VI           376     8
V          1,244     7
IV           145     5

of geographic area sampled. Thus, one might compare taxonomic richness with

the amount excavated, either the area or volume excavated. Wolff (1975) showed

long ago that the greater the volume of sediment searched for faunal remains, the

greater the number of taxa found (see Chapter 4). Taxonomic richness increases

as the amount of sediment examined increases because as the amount of sediment

examined increases, NISP increases (more specimens are recovered, so more taxa

Figure 5.5. Relationship between NISP and NTAXA per stratum (Roman numerals) at

Le Flageolet I, France (after Grayson and Delpech 1998). Dashed, best-fit regression line,

r = 0.99, p = 0.06; solid, best-fit regression line, r = 0.98, p = 0.02. Some strata omitted as

sample sizes are too small for inclusion. Cultural associations for each stratum are indicated.

Data from Table 5.3.

are identified). As the amount of sediment examined increases, the amount of area

examined increases, which brings us back to the original species–area relationship

discovered by botanists.

Regardless of the technique used to gain insight to the structure and composition of

a fauna, taxonomic richness is often strongly correlated with sample size. Therefore,

the analyst must be ever on the alert for differences in sample size measured as

NISP as a variable that potentially contributes to differences in NTAXA. The

analyst should also realize that the possible influence of sample size on all measures

of taxonomic diversity (structure and composition) might be disputed, such as when

all of a site deposit (all of a single stratum, or all of a site within horizontal and vertical

boundaries) has been excavated. In such cases one might argue that a 100-percent

sample has been collected and that taxonomic richness cannot be considered to be

a function of sample size. This is in some senses true, but it also overlooks two

fundamental issues. First, a very small number of sites (or strata within sites other

than trivial cases such as the fill of a single intrusive pit) have been totally excavated

(meaning that a 100-percent sample has been generated). Second, even if a site is

completely excavated, it is likely to be but a portion of some larger cultural system than

is found in that single site or only a portion of the taphocoenose, thanatocoenose,

or biocoenose. This brings us back to Kintigh's (1984) fundamental dilemma of

defining the population one wishes to model with available samples. Only when that

population is defined beforehand will we know if we have a 100-percent sample or

not, and even then preservation variation and recovery procedures may result in less

than complete retrieval (Chapter 4).
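
The check recommended above, asking whether NTAXA tracks sample size before interpreting differences in richness, can be sketched in a few lines of Python. This is an illustration only, not the procedure used by Grayson and Delpech (1998): it pools all seven Le Flageolet I strata from Table 5.3 and correlates NTAXA with log10(NISP). The log transform and the function names (pearson_r, richness_vs_sample_size) are assumptions introduced for the example.

```python
import math

# Table 5.3: (NISP, NTAXA) of ungulates per stratum at Le Flageolet I.
le_flageolet = {
    "XI": (651, 6), "IX": (681, 11), "VIII": (461, 9), "VII": (1768, 10),
    "VI": (376, 8), "V": (1244, 7), "IV": (145, 5),
}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def richness_vs_sample_size(assemblages):
    """Correlate NTAXA with log10(NISP); the log transform is an assumption."""
    log_nisp = [math.log10(nisp) for nisp, _ in assemblages.values()]
    ntaxa = [taxa for _, taxa in assemblages.values()]
    return pearson_r(log_nisp, ntaxa)

r_pooled = richness_vs_sample_size(le_flageolet)
print(f"r (all strata pooled) = {r_pooled:.2f}")
```

Because the Le Flageolet I strata appear to fall along two separate lines (Figure 5.5), a pooled coefficient of this kind will understate the strength of each relationship; in practice one would examine each group of strata separately.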

Taxonomic Composition

Two faunas can have the same NTAXA, but share anywhere from none to all of

the taxa represented. How do we compare faunas in terms of the taxa they hold in

common and those taxa that are unique to one or the other? How do we determine if

two faunas are similar in taxonomic composition, and how do we determine if fauna

A and fauna B are more similar to one another than either is to fauna C? Indices

have been designed to answer these questions and to measure just these features

(see reviews in Cheetham and Hazel 1969; Henderson and Heron 1977; Janson and

Vegelius 1981; Raup and Crick 1979). For unclear reasons these indices have seldom

been used by zooarchaeologists (Styles [1981] is a noteworthy exception). It could

be a result of benign neglect, or it could be that the influences of varying sample

size are a concern. Before considering the latter issue, however, let's consider some

exemplary indices. These are sometimes referred to as binary coefficients, because

they summarize and compare presence–absence (nominal scale) data.

One index is the Jaccard index (J) [originally, “coefficient of floral community”].

It is calculated as

$J = 100C/(A + B - C)$,

where A is the total number of taxa in fauna A, B is the total number of taxa in

fauna B, and C is the number of taxa common to both A and B. Another index is the

Sorenson index (S), calculated as

$S = 100(2C)/(A + B)$,

where the variables are as defined for the Jaccard index. Given how they are calculated,

the Jaccard index emphasizes differences in two faunas, and the Sorenson index

emphasizes similarities. For example, if A = 6, B = 6, and C = 4, then J = 50, whereas S =

66.7. Comparing the Meier site mammalian fauna (A) with the complete Cathlapotle

mammalian fauna (B), A = 26, B = 25, and C = 20. Thus, for these two faunas, J = 64.5

and S = 78.4. Given that the two faunas fall within the same time period, are < 10 km

apart, and occur in virtually identical habitats, it may seem that the indices of faunal

similarity should be considerably higher. This is so because, at least with respect to

statistical precision sampling (as opposed to discovery sampling; see Chapter 4), at

least the Meier site sample seems to be representative because significant increases

in its size over the last several years of excavation failed to produce any previously

unidentified taxa (Lyman and Ames 2004).
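
The two binary coefficients are simple enough to check by hand, but a short Python sketch makes the arithmetic explicit and reproduces the values quoted above. The function names (jaccard, sorenson, indices_from_taxa) are introduced here for illustration and are not part of any published software.

```python
def jaccard(a, b, c):
    """Jaccard index on a 0-100 scale: a, b = NTAXA per fauna, c = shared taxa."""
    return 100 * c / (a + b - c)

def sorenson(a, b, c):
    """Sorenson index on a 0-100 scale, with the same arguments as jaccard()."""
    return 100 * (2 * c) / (a + b)

def indices_from_taxa(fauna_a, fauna_b):
    """Compute (J, S) directly from two collections of taxon names."""
    set_a, set_b = set(fauna_a), set(fauna_b)
    shared = len(set_a & set_b)
    return (jaccard(len(set_a), len(set_b), shared),
            sorenson(len(set_a), len(set_b), shared))

# Hypothetical example from the text: A = 6, B = 6, C = 4.
print(round(jaccard(6, 6, 4), 1), round(sorenson(6, 6, 4), 1))        # 50.0 66.7
# Meier (A = 26) versus Cathlapotle (B = 25), with C = 20 shared genera.
print(round(jaccard(26, 25, 20), 1), round(sorenson(26, 25, 20), 1))  # 64.5 78.4
```

Note that the denominator of the Jaccard index, A + B - C, is the number of taxa in the combined collections (26 + 25 - 20 = 31 for Meier and Cathlapotle).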

Why the Meier and Cathlapotle assemblages are not more similar and do not share

more mammalian genera is an ultimate question. It may have to do with variation in

which taxa were accumulated despite similarity of the agents of accumulation; at both

sites humans were the most significant accumulation agent. Or, it may actually have

to do with a fundamental problem of all such binary coefficients (Raup and Crick

1979). That problem can be illustrated with a pair of Venn diagrams (Figure 5.6).

Each of these has been drawn with the Meier and Cathlapotle collections in mind. A

total of thirty-one genera are represented by the two collections. One Venn diagram

suggests each collection is a sample of those thirty-one genera. The other Venn

diagram indicates that each collection is a sample of the total forty mammalian

genera (excluding eight genera of bats) that occur in the area today (Johnson and

Cassidy 1997). Given that neither zooarchaeological collection has significantly more

than two-thirds of those genera, it is perhaps not surprising that the two do not share

more taxa. Each site collection represents but a sample of the local biotic community.

Another way to make the point of the preceding paragraph is this: Based on earlier

discussions, it should be obvious that sample size (= NISP) will influence binary

Figure 5.6. Two Venn diagrams based on the Meier site and Cathlapotle site collections.

Upper diagram suggests each collection is a sample of the thirty-one genera represented by

the combined collections. Lower diagram indicates that each collection is a sample of the

total forty mammalian genera that occur in the area.

coefficients such as the Jaccard and Sorenson indices. This has been known for

decades; the more individuals, the taxonomically richer the sample, so when Paul

Jaccard proposed his index he suggested areas of similar size be sampled, but he

should have suggested similar numbers of individuals be inspected (Williams 1949).

Consider the fact that both the Meier and Cathlapotle collections are samples, and

thus even if remains of all forty mammalian genera known in the area today had

been accumulated and deposited in site deposits, it is likely that remains of rarely

represented taxa would not be recovered. If more of each of those sites had been

excavated, and several thousand more NISP had been recovered from each site, it

is probable that several of those as yet unidentified genera would occur in those

collections. This would not only represent a shift in the sampling design toward a

discovery model, but it would also increase the magnitude of both the Jaccard index

and the Sorenson index.

Neither the Sorenson index nor the Jaccard index takes advantage of the abundance

of taxa. A simple way to assess the similarity of taxonomic abundances of two faunas

is to calculate a χ² statistic (e.g., Broughton et al. 2006; Grayson 1991b; Grayson and

Delpech 1994). To illustrate this, the NISP data for the collection of faunal remains

from eighty-four owl pellets (Table 2.9) is summarized as two chronologically distinct

Table 5.4. NISP per taxon in two chronologically distinct samples of eighty-four owl pellets

Taxon              1999 sample    2000–2001 sample
Sylvilagus               5               0
Reithrodontomys          0              19
Sorex                   40               6
Thomomys                52              16
Microtus               302             403
Peromyscus           1,147             119

samples in Table 5.4. Chronological distinction concerns when the pellets were collected.

χ² analysis indicates the two samples differ significantly in terms of taxonomic

abundances (χ² = 586.68, p < 0.0001). The two sets of taxonomic abundances are not

significantly correlated (Spearman's ρ = 0.6, p > 0.2), which also suggests they may have derived

from different populations, but do the abundances of all of the taxa differ significantly

between the two samples, or the abundances of just a few of the taxa? To answer this

question, adjusted residuals for each cell were calculated (there are six taxa, and two

samples for each, so twelve cells) to determine if any of the observed values were greater

or less than would be expected were the two temporally distinct samples derived from

the same population. Basically, the adjusted residual provides a way to determine if

the observed and expected values per cell are statistically significantly different or not

(see Everitt 1977 for discussion of the statistical method). Expected values (compare

with Table 5.4) and interpretations for each cell are given in Table 5.5. Abundances of

four taxa are causing the statistically significant difference between the two samples;

specimens of Reithrodontomys, Sorex, Microtus, and Peromyscus are not randomly

distributed between the two chronologically distinct samples. Only Sylvilagus and

Table 5.5. Expected values (E) and interpretation (I) of taxonomic abundances in two temporally distinct assemblages of owl pellets. See Table 5.4 for observed values

Taxon              1999 E    2000–2001 E    1999 I                 2000–2001 I
Sylvilagus           3.7         1.3        p > 0.05               p > 0.05
Reithrodontomys     13.9         5.1        p < 0.05, too few      p < 0.05, too many
Sorex               33.7        12.3        p < 0.05, too many     p < 0.05, too few
Thomomys            49.8        18.2        p > 0.05               p > 0.05
Microtus           516.8       188.2        p < 0.05, too few      p < 0.05, too many
Peromyscus         928.0       338.0        p < 0.05, too many     p < 0.05, too few

Thomomys occur in the two samples in abundances that are not unexpected; abun-

dances of these two taxa suggest the temporally distinct samples were drawn from

the same population.
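
The owl pellet analysis can be retraced with a short Python sketch using only the observed values in Table 5.4. It is offered as an illustration under stated assumptions, not as the original computation: the adjusted residual is taken in its usual form, (O - E) / sqrt(E (1 - row total/N) (1 - column total/N)), which is approximately standard normal when both samples derive from the same population, and Spearman's ρ is computed with the rank-difference formula, which is valid here because neither sample contains tied abundances. The function names are introduced for the example.

```python
import math

# Observed NISP per taxon (Table 5.4): (1999 sample, 2000-2001 sample).
observed = {
    "Sylvilagus": (5, 0),
    "Reithrodontomys": (0, 19),
    "Sorex": (40, 6),
    "Thomomys": (52, 16),
    "Microtus": (302, 403),
    "Peromyscus": (1147, 119),
}

def chi_square_with_residuals(table):
    """Return (chi2, expected values, adjusted residuals) for a taxon x sample table."""
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    grand = sum(col_totals)
    chi2 = 0.0
    expected, residuals = {}, {}
    for taxon, row in table.items():
        row_total = sum(row)
        exp_row, res_row = [], []
        for j, obs in enumerate(row):
            exp = row_total * col_totals[j] / grand
            chi2 += (obs - exp) ** 2 / exp
            # Variance term of the adjusted residual (assumed standard form).
            var = exp * (1 - row_total / grand) * (1 - col_totals[j] / grand)
            exp_row.append(exp)
            res_row.append((obs - exp) / math.sqrt(var))
        expected[taxon] = exp_row
        residuals[taxon] = res_row
    return chi2, expected, residuals

def spearman_rho(xs, ys):
    """Spearman's rho by the rank-difference formula (assumes no tied values)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            result[i] = rank
        return result
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

chi2, expected, residuals = chi_square_with_residuals(observed)
# |adjusted residual| > 1.96 marks a cell that departs from expectation at p < 0.05.
significant = [t for t, res in residuals.items() if abs(res[0]) > 1.96]
rho = spearman_rho([v[0] for v in observed.values()],
                   [v[1] for v in observed.values()])
print(f"chi2 = {chi2:.2f}, rho = {rho:.1f}")
print("taxa with significant residuals:", significant)
```

A positive residual corresponds to "too many" and a negative residual to "too few" in Table 5.5. With only two samples, the two residuals for a taxon are equal in magnitude and opposite in sign, so inspecting the 1999 column is sufficient.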

Some research has suggested that the Sorenson index provides a better estimate

of similarity than Jaccard's index (Magurran 1988:96). Not surprisingly, ecologists

designed a version of Sorenson's index to take account of variation in taxonomic

abundances. That index, Sorenson's quantitative index, is calculated as

$S_q = 2c_N/(A_N + B_N)$,

where $A_N$ is the total frequency of organisms (all taxa summed) in fauna A, $B_N$ is

the total frequency of organisms in fauna B, and $c_N$ is the sum of the lesser of the

two abundances of taxa shared by the two assemblages. Using the data in Tables 4.2

and 4.3 for Meier and Cathlapotle mammalian genera, $A_N$ (Meier) = 6,421, $B_N$

(Cathlapotle) = 6,973, and $c_N$ = 4,358 (3 Scapanus + 7 Aplodontia + 342 Castor + 5 Peromyscus +

68 Microtus + 106 Ondatra + 39 Canis + 5 Vulpes + 102 Ursus + 207 Procyon + 2 Martes + 29 Mustela +

3 Mephitis + 51 Lutra + 9 Felis + 26 Lynx + 43 Phoca + 935 Cervus + 2,376 Odocoileus). Thus, $S_q$ =

2(4,358)/(6,421 + 6,973) = 8,716/13,394 = 0.651, or 65.1. Recall that the (nonquantitative)

Sorenson's index was 78.4. Thus, regardless of which index of similarity is used, the

faunas seem fairly similar, though less so when the abundances of taxa are included

than when they are ignored.
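
The quantitative index is easy to mis-add by hand, so a brief Python sketch of the calculation may be useful. The function sorenson_quantitative is a hypothetical helper that takes two mappings of taxon to NISP; because the full Meier and Cathlapotle tallies (Tables 4.2 and 4.3) are not reproduced here, the worked check below uses only the lesser abundances listed in the text and the two totals, with the Cathlapotle total taken as 6,973, the value used in the worked division.

```python
def sorenson_quantitative(fauna_a, fauna_b):
    """Sq = 2*cN / (AN + BN) for two {taxon: NISP} mappings."""
    a_n = sum(fauna_a.values())
    b_n = sum(fauna_b.values())
    shared = set(fauna_a) & set(fauna_b)
    # cN: for each shared taxon, take the lesser of its two abundances.
    c_n = sum(min(fauna_a[t], fauna_b[t]) for t in shared)
    return 2 * c_n / (a_n + b_n)

# Lesser abundance of each shared genus, in the order listed in the text.
lesser_abundances = [3, 7, 342, 5, 68, 106, 39, 5, 102, 207,
                     2, 29, 3, 51, 9, 26, 43, 935, 2376]
c_n = sum(lesser_abundances)
sq = 2 * c_n / (6421 + 6973)
print(c_n, round(sq, 3))  # 4358 0.651
```

Given complete tallies for both sites, a call such as sorenson_quantitative(meier, cathlapotle) would return the same proportion; multiplying by 100 puts it on the scale of the binary indices.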

A simple way to show similarities and differences between two faunas in terms

of shared taxa, unique taxa, and taxonomic abundances, is to generate a bivariate

scatterplot. Figure 5.7 shows relative (percentage) abundances of those taxa from

Meier and Cathlapotle represented by NISP < 200 at both sites. Notice that were the

relative abundances of taxa equivalent at the two sites, the points would fall close to

the diagonal line; the more equal the relative abundances, the closer to the diagonal

the points would fall. Note as well that more of the points fall on the Meier side

of the diagonal. This suggests that those taxa are relatively more abundant at Meier

than they are at Cathlapotle. Such a graph takes advantage of abundance data in a

visual way. Ecologists are working to develop versions of the Jaccard and Sorensen

indices that also take advantage of abundance data (e.g., Chao et al. 2005), but these

are beyond the scope of the discussion here.

That the binary coefficients designed to measure taxonomic similarities of faunal

collections have been largely ignored by zooarchaeologists is likely a good thing. Those

coefficients are heavily influenced by the sample sizes (= NISP) of the compared

collections because taxonomic richness is significantly influenced by sample size

(Chao et al. 2005). Again, one might use rarefaction in an effort to control sample-

size effects, and that is what some paleozoologists have done (e.g., Barnosky et al. 2005;

Figure 5.7. Bivariate scatterplot of relative (percentage) abundances of mammalian genera

at the Meier site and Cathlapotle. Only genera for which NISP < 200 are plotted. Diagonal

line is shown for reference.

Byrd 1997). This could be a good thing, but it is perhaps unwise for the simple reason

that as was noted more than 20 years ago, the rarefaction procedure was designed

to be used with quantitative units that are statistically independent of one another

(Grayson 1984:152). Those units are also ratio scale values. NISP tallies comprise

units that are probably statistically interdependent and that are also typically at best

ordinal scale values. Given these facts, should one choose to perform a rarefaction

analysis using NISP values, the results should be interpreted in at most ordinal scale

terms. An example will make this clear.

A rarefaction curve based on the eighteen assemblages listed in Table 5.1, constructed

using Holland's (2005) Analytical Rarefaction, is shown in Figure 5.8 and

suggests that, given a total NTAXA of twenty-eight for the area represented by those

eighteen collections, none of the collections contains all twenty-eight taxa, most

collections contain very few taxa, but none contain too few for their size, and four

(45DO214, 45DO326, 45DO211, 45DO285) of the eighteen collections seem to contain

more taxa than they should given their size (NISP). The rarefaction curve

Figure 5.8. Rarefaction analysis of eighteen assemblages of mammal remains from eastern

Washington State using Holland's (2005) Analytical Rarefaction. Data from Table 5.1.

thus reveals something we didn't know before because it presents the data in a

unique, interpolated way. If I were analyzing these collections, I would try to deter-

mine why four of the collections were unexpectedly taxonomically rich; perhaps they

are temporally unique, functionally/behaviorally unique, or located in a particular

microenvironment.
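
For readers who want to see what a rarefaction curve computes, the following Python sketch gives the expected NTAXA in a random subsample of n specimens drawn without replacement from an assemblage. It assumes the standard hypergeometric formulation of rarefaction; whether it matches Holland's program in every detail has not been verified here, and the function name expected_richness is introduced for the example. Because the Table 5.1 data are not reproduced in this section, the demonstration uses the 1999 owl pellet sample from Table 5.4 purely as convenient input. The caution about interdependent NISP units applies to this sketch as much as to any other rarefaction.

```python
from math import comb

def expected_richness(counts, n):
    """Expected NTAXA in a random subsample of n specimens (no replacement)."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the assemblage size")
    # For each taxon: 1 - P(no specimen of that taxon is in the subsample).
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)

# Taxa present in the 1999 owl pellet sample (Table 5.4); total NISP = 1,546.
counts_1999 = [5, 40, 52, 302, 1147]

for n in (10, 100, 500, sum(counts_1999)):
    print(n, round(expected_richness(counts_1999, n), 2))
```

An observed assemblage that plots above the curve at its NISP contains more taxa than a random draw of that size would be expected to yield, which is how the four unexpectedly rich collections in Figure 5.8 were recognized.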

Were one to perform a rarefaction analysis like that shown in Figure 5.8, one should

first determine if those assemblages are nested. Recall from Chapter 4 that in a series

of perfectly nested faunas, successively smaller faunas will have fewer of the taxa

represented in those faunas that are successively larger, and larger faunas will have

all those taxa represented in smaller faunas plus additional taxa. The interpretive

assumption is that nested faunas all derive from the same parent population. There

are ways to test the degree of nestedness of faunas. That has been done with the faunas

in Figure 5.8; consider Figure 4.12, which shows that the nestedness “temperature”

for this set of eighteen faunas is 18.23°, meaning the faunas are relatively strongly

nested. The rarefaction analysis in Figure 5.8 thus seems reasonable, if one is willing

to allow an unknown degree of skeletal specimen interdependence and, thus, allow

a bit of statistical sloppiness.

Taxonomic Heterogeneity

Several indices have been developed to measure taxonomic heterogeneity. Paleozool-

ogists have tended to use only two of these, although there are several different ones

that are occasionally mentioned (e.g., Andrews 1996). By far the most popular one

among zooarchaeologists is the Shannon–Wiener index, sometimes referred to as

the Shannon index. It generally varies between 1.5 and 3.5 (Magurran 1988:35); larger

values signify greater heterogeneity. The Shannon index is calculated as:

$H = -\sum P_i (\ln P_i)$,

where Pi is the proportion (P) of taxon i in the assemblage. The proportion (some-

times referred to as “importance”) of each taxon in the collection is multiplied by

the natural log of that proportion. Because proportions are < 1, transforming those

values to natural logs results in a negative sign. Values of the products of the multi-

plications are summed, and then converted from a negative value to a positive value

by the negative or “−” sign in front of the summation (Σ) sign.
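
The steps just described translate directly into a few lines of Python. This is a minimal sketch; the function name shannon_index is introduced for the example, and the two input lists are hypothetical tallies chosen to contrast an even and an uneven assemblage of equal richness.

```python
import math

def shannon_index(nisp_per_taxon):
    """Shannon-Wiener H = -sum(P_i * ln P_i), computed from NISP tallies."""
    total = sum(nisp_per_taxon)
    h = 0.0
    for nisp in nisp_per_taxon:
        if nisp == 0:
            continue  # a taxon with no specimens contributes nothing
        p = nisp / total           # proportion of taxon i
        h -= p * math.log(p)       # ln(p) is negative, so subtracting adds
    return h

even = [25, 25, 25, 25]    # four equally abundant taxa
uneven = [97, 1, 1, 1]     # same NTAXA, dominated by one taxon

print(round(shannon_index(even), 3))    # equals ln(4)
print(round(shannon_index(uneven), 3))  # much lower heterogeneity
```

With richness held constant at four taxa, the even assemblage attains the maximum possible value, ln(4), whereas the uneven one falls far below it, which illustrates why heterogeneity reflects both richness and evenness.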

Let's say we want to determine the taxonomic heterogeneity (at the genus level)

of the total Meier site mammal collection (Table 4.2). The data and mathematical

steps for calculating the value of the Shannon–Wiener index are summarized in

Table 5.6. NTAXA for this collection is 26. The heterogeneity index is 1.556, suggesting

the total Meier site mammal collection is somewhat heterogeneous. For comparative

purposes, consider the fact that the Shannon–Wiener heterogeneity index for the

total Cathlapotle collection, without distinction of the Precontact and Postcontact

assemblages (Table 4.3), has a value of 1.487, indicating that the heterogeneity of the

Cathlapotle collection is a bit less than that of the Meier collection. One contributing

factor here is that the Meier collection, with twenty-six taxa, is taxonomically richer

than the Cathlapotle collection, which contains remains of only twenty-four taxa.

Does a difference in the evenness of the two assemblages also contribute to the dif-

ference in heterogeneity? To answer that question requires calculation of an evenness

index for each collection.

Because heterogeneity is a function of taxonomic richness and evenness, it is

possible that heterogeneity will also be a function of sample size (e.g., Grayson 1981b).

Thus, if one wishes to measure heterogeneity and compare that variable across several

different samples, it is advisable to determine if there is any relationship between the

measures of heterogeneity and NISP for a set of samples. Once again, consider

the eighteen assemblages from eastern Washington State (Table 5.1). The relationship

between sample size per site and heterogeneity per site is, in the case of these eighteen

Table 5.6. Derivation of the Shannon–Wiener index of heterogeneity for the Meier site (original data from Table 4.2). Logs are natural logarithms

Taxon           NISP    Proportion (p)    Log of p    p(log p)     Running sum
Scapanus          18       0.00280         −5.878     −0.016         −0.016
Sylvilagus        18       0.00280         −5.878     −0.016         −0.032
Aplodontia         7       0.00109         −6.822     −0.007436      −0.039
Tamias             1       0.00016         −8.740     −0.001398      −0.041
Tamiasciurus       2       0.00031         −8.079     −0.002504      −0.043
Thomomys           9       0.00140         −6.571     −0.009199      −0.053
Castor           342       0.05326         −2.933     −0.156         −0.209
Peromyscus        35       0.00545         −5.212     −0.028         −0.237
                                           −8.740     −0.001398      −0.238