CHANCE. ISSN: 0933-2480 (Print) 1867-2280 (Online). Journal homepage: http://www.tandfonline.com/loi/ucha20

To cite this article: Hongmei Liu, Jay Parker & Wei Sun (2011) Estimating Rates at Which Books Are Mis-Shelved, CHANCE, 24:1, 36-43, DOI: 10.1080/09332480.2011.10739850

To link to this article: http://dx.doi.org/10.1080/09332480.2011.10739850

Published online: 02 Aug 2013.


Estimating Rates at Which Books Are Mis-Shelved

Hongmei Liu, Jay Parker, and Wei Sun

The basic goal of survey sampling is to draw inferences on selected parameters, or characteristics, of a population. Usually, it is impractical or impossible to examine all the individuals in an entire population. By using a controlled random sampling strategy, one can examine a sample selected from a population, which takes less time, costs less money, and is operationally simpler than doing a census. It also achieves acceptably accurate results.

Basic ideas of survey sampling were used in a project for an introductory survey sampling course at the University of Illinois-Chicago, taught and supervised by Samad Hedayat, in the fall of 2009. The task was to estimate the mis-shelving rate of books at the university library. The following relates the story of that exercise.

Survey Design

To draw a representative sample that can reflect on the population with reasonable accuracy, we need to select a suitable sampling design. Let U = {1, 2, …, N} index a finite population of N distinct and identifiable units. We refer to N as the size of the finite population. The elements of U, namely 1, 2, …, N, also are known as the sampling units. Here, identifiability means there is a 1:1 correspondence between the units and the indexes 1, 2, …, N. The sampling frame


is a list of all the N population units from which a sample is drawn. There is a study variable Y that assumes values Y_1, Y_2, …, Y_N on the N units in the population. For example, in the study of mis-shelved books, we are interested in the mis-shelving rate. In this case, the total number of books is N, and Y_i = 1 if the ith book is mis-shelved, Y_i = 0 otherwise; then,

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$

refers to the mis-shelving rate per book. Here, all the N books are distinct and identifiable.

A duplicate copy of a title counts as a separate book for this application. In reality, the mis-shelving rate per book is small. Therefore, it is often expressed in terms of a suitably defined larger collection (of books). Herein, we will work with the shelves as the reference population units, since these define natural collections of books.

For a reasonably moderate collection of identifiable reference units, such as books or shelves, simple random sampling without replacement (SRSWOR) is recommended over taking a convenience or opportunistic sample. Once the sample size n (in terms of the number of reference units to be sampled) has been determined, the SRSWOR method ensures equal chance of all possible combinations of n units out of the totality of all reference units. SRSWOR can be implemented by putting index cards with identifying information on each reference unit into a box, or by putting electronic records into a computer, and randomly selecting n indexes sequentially without replacement with equal probability at each draw.
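The index-card procedure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `srswor`, its parameters, and the seed are hypothetical, and the population is simply taken to be the indexes 1 to N.

```python
import random

def srswor(population_size, sample_size, seed=None):
    """Draw a simple random sample without replacement (SRSWOR).

    Units are indexed 1..population_size. At each draw, one of the
    remaining units is picked with equal probability, like pulling
    index cards from a box without putting them back.
    """
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")
    rng = random.Random(seed)
    remaining = list(range(1, population_size + 1))
    sample = []
    for _ in range(sample_size):
        pick = rng.randrange(len(remaining))  # equal chance for every remaining unit
        # Move the picked unit to the end and pop it, so removal is O(1).
        remaining[pick], remaining[-1] = remaining[-1], remaining[pick]
        sample.append(remaining.pop())
    return sample

if __name__ == "__main__":
    # Example: choose 5 of 20 hypothetical reference units.
    print(srswor(20, 5, seed=1))
```

Python's built-in `random.sample` would give a sample of the same kind; the explicit loop is shown only to mirror the sequential draws in the text.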

Most often, however, alternative, more convenient, and accurate sampling methods are available for a large collection of units. One method is referred to as stratified simple random sampling (STRSRS). According to this method, the whole population of reference units is divided into a number of subpopulations, called strata, and independent simple random samples are conducted in each stratum. We denote by L the number of strata and assume there are M_h population units in the hth stratum, h = 1, 2, …, L.

We obtained preliminary information on mis-shelving rates and frequency of use on different book collections by consulting librarians and reading literature. Because different collections have different frequency of use and some collections naturally result in a low number of mis-shelved books, we divided the library into six collections by physical location and frequency of use. Each such collection served as a stratum in our study.

The collections for our study comprised 1st floor south, 2nd floor south, 2nd floor north, 3rd floor north, 4th floor south, and 4th floor north. Those collections covered all the areas in the library except the noncirculated items and government documents, which we excluded from our study. In the above notation, we have L = 6 strata. If each stratum were amenable to conducting a SRSWOR, we could have done so and combined the estimates from each stratum to estimate the overall mis-shelving rate. Stratified sampling often is more efficient than simple random sampling, because it explicitly removes the between-strata variation from the estimation.
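As a rough sketch of what independent sampling within each of the six strata looks like, the snippet below draws the same number of shelves from every collection. The shelf counts are the stratum sizes reported in Table 5; the function name `stratified_sample`, the seed, and the idea of numbering shelves 1 to M within a stratum are assumptions made for illustration.

```python
import random

# Number of shelves in each stratum, as reported in Table 5 of the article.
SHELVES_PER_STRATUM = {
    "1st Floor S.": 1033,
    "2nd Floor S.": 1221,
    "2nd Floor N.": 1239,
    "3rd Floor N.": 7869,
    "4th Floor S.": 9945,
    "4th Floor N.": 9528,
}

def stratified_sample(stratum_sizes, per_stratum, seed=None):
    """Draw an independent SRSWOR of `per_stratum` units from every stratum.

    Units in a stratum of size M are assumed to be labeled 1..M.
    Returns a dict mapping stratum name to a sorted list of sampled labels.
    """
    rng = random.Random(seed)
    return {
        name: sorted(rng.sample(range(1, size + 1), per_stratum))
        for name, size in stratum_sizes.items()
    }

if __name__ == "__main__":
    samples = stratified_sample(SHELVES_PER_STRATUM, per_stratum=100, seed=2009)
    for name, shelves in samples.items():
        print(name, len(shelves), shelves[:5])
```

Equal allocation (100 shelves per stratum) is the choice the article eventually makes; a proportional allocation would replace `per_stratum` with a size that depends on each stratum.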

At this stage, we realized the stratum sizes were large and still not easy to sample. The books, however, were conveniently located in stacks of bookshelves, running into hundreds and thousands of bookshelves. Moreover, the bookshelves were easy to list and contained a more-or-less constant number of books (30) per shelf. Although there was slight stratum-to-stratum variation in this number, this was approximately true for each stratum. Also, it is not difficult for a single person to check the shelving order of 30 books.

Within a stratum, in survey sampling terminology, the population of books is organized into clusters of books. From within each stratum, we decided to select a few clusters and inspect the status of all the books in each selected cluster. Cluster sampling is useful in



Table 1: Precision Test of Sample Size

| Samples | Total Number of Books | Mis-shelving Rate per Shelf ($\hat{\phi}_h$) | s.e. of $\hat{\phi}_h$ |
|---|---|---|---|
| SRSWOR(9945, 100) | 3,110 | 0.80 | 0.0891 |
| SRSWOR(9945, 1000) | 29,886 | 0.71 | 0.0250 |

Table 2(a): Mis-Shelving Rate by Column and Row Positions of the Shelf in Book Stacks

| Data | Row\Column | Bottom | Middle | Top | Total |
|---|---|---|---|---|---|
| # of Shelves | End | 87 | 130 | 104 | 321 |
| | Middle | 151 | 260 | 168 | 579 |
| # of Books per Shelf | End | 28 | 29 | 30 | 29 |
| | Middle | 31 | 30 | 30 | 30 |
| # of Mis-Shelved Books per Shelf | End | 0.67 | 0.63 | 0.74 | 0.68 |
| | Middle | 0.86 | 0.66 | 0.68 | 0.72 |
| Mis-Shelving Rate per Book | End | 2.36% | 2.18% | 2.50% | 2.33% |
| | Middle | 2.78% | 2.20% | 2.30% | 2.38% |
| Total # of Shelves | | 238 | 390 | 272 | 900 |
| Average # of Books per Shelf | | 30 | 30 | 30 | 30 |
| Average # of Mis-Shelved Books per Shelf | | 0.79 | 0.65 | 0.70 | 0.70 |
| Mis-Shelving Rate per Book | | 2.63% | 2.19% | 2.37% | 2.36% |

Table 2(b): Chi-Square Tests Comparing Mis-Shelving Rate of the Shelves

| Row | Column | # of Mis-Shelved Books | Total # of Books | p-Value |
|---|---|---|---|---|
| End | Bottom | 58 | 2456 | Reference |
| End | Middle | 82 | 3768 | 0.6006 |
| End | Top | 77 | 3081 | 0.7263 |
| Middle | Bottom | 130 | 4679 | 0.3127 |
| Middle | Middle | 172 | 7825 | 0.6893 |
| Middle | Top | 114 | 4967 | 0.8059 |


situations in which the frame of the population under study is either not readily available or largely inadequate for the purpose of sampling.

Indeed, we could not be sure of the exact shelf location of each book based on its electronic record. Instead, the population units were found to be conveniently grouped into several natural clusters. In such cases, splitting the population into representative clusters can make sampling more practical. Then, we could simply select one or a few clusters at random and perform a census within each of the clusters.

A potential disadvantage of cluster sampling is that the units within a cluster can be quite similar. For example, a remote shelf with old volumes on it might be largely undisturbed compared with the general library. One could potentially learn more about the overall mis-shelving rate by taking 30 randomly selected books, rather than 30 from one shelf. Still, cluster sampling is often used due to its practical advantages.

In summary, the design used in the project was a stratified single-stage cluster sampling design. That is, first a STRSRS design was used to select clusters (the book shelves in our study) within each stratum (book collection). Then, a census of books in the selected clusters was performed to determine the accuracy of shelving. In effect, we used the shelves as clusters of books and used the shelves as reference units for sampling and inference.

Definition of Outcome Variable

Recall that the parameter of interest is the mis-shelving rate. The understanding was that a mis-shelved book would be one whose call number was larger than the call number of the adjacent books on its right or smaller than the call number of the adjacent books on its left. That is, if {1, 2, 3, 4, 5} is the correct order, then for the sequence {1, 2, 5, 3, 4}, only unit {5} is considered mis-shelved.
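One way to turn this definition into a counting rule that agrees with the {1, 2, 5, 3, 4} example is to flag the smallest set of books whose removal leaves the shelf in order, which can be found through a longest non-decreasing subsequence. The sketch below is an assumption about how the rule could be operationalized, not the procedure the authors used by hand; the function name `misshelved_positions` is hypothetical, and call numbers are assumed to be already converted to comparable sort keys.

```python
from bisect import bisect_right

def misshelved_positions(call_numbers):
    """Return 0-based positions of books outside a longest in-order run.

    Assumption: the mis-shelved books are a smallest set whose removal
    leaves the shelf in non-decreasing call-number order. When several
    such sets exist, one of them is returned.
    """
    tails = []        # tails[k]: position of the last book of the best in-order run of length k + 1
    tail_values = []  # call numbers at those positions (kept sorted for bisect)
    previous = [None] * len(call_numbers)
    for i, value in enumerate(call_numbers):
        # bisect_right lets equal call numbers (duplicate copies) sit side by side.
        k = bisect_right(tail_values, value)
        previous[i] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(i)
            tail_values.append(value)
        else:
            tails[k] = i
            tail_values[k] = value
    # Walk back through the longest run to mark the correctly shelved books.
    in_order = set()
    i = tails[-1] if tails else None
    while i is not None:
        in_order.add(i)
        i = previous[i]
    return [i for i in range(len(call_numbers)) if i not in in_order]

if __name__ == "__main__":
    shelf = [1, 2, 5, 3, 4]
    flagged = misshelved_positions(shelf)
    print([shelf[i] for i in flagged])  # the example from the text: only book 5
```

A purely local rule that compares each book with its two neighbors would also flag book 3 in this example (it is smaller than the book on its left), which is why a subsequence-based reading is used here.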

Our definition included identifiers such as volume numbers in a series of bound volumes of a periodical. We also accounted for books located on the floor close to bookshelves or on top of other books. We treated a shelf of books as forming a natural cluster. Normally, a book stack consisted of seven rows and 13 columns of shelves on both sides. A shelf is a natural cluster that is easy to identify and locate.

Also, one does not have to account for borrowed, circulating, or lost books when selecting a sample of nonempty shelves, as one does when selecting a sample of individual books. In many contexts, clearly defining the variables to be measured is an important component of study development.

Sample Size and Pilot Study

To check whether it is worthwhile to allocate sample size (i.e., number of shelves to be selected from each stratum of shelves) proportional to the stratum size and, further, whether, for example, 100 shelves make up a suitable sample size, we conducted a precision test on the 4th floor south collection. This collection had 9,945 shelves in total. We obtained a sample of 100 shelves using an SRSWOR(9945, 100) design. We then sampled another 900 shelves using an SRSWOR design from the remaining subpopulation of 9,845 shelves. Combining these two samples produces a sample of 1,000 shelves belonging to the class of SRSWOR(9945, 1000) designs. We recorded both the number of books per shelf and the number of mis-shelved books per shelf. Table 1 summarizes the comparison between the samples.
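The two-step pilot draw (100 shelves first, then 900 more from the shelves not yet chosen) can be sketched as follows. The function name `extend_srswor` and the seed are hypothetical; shelves are assumed to be labeled 1 to 9,945.

```python
import random

def extend_srswor(population_size, first_size, extra_size, seed=None):
    """Draw an SRSWOR, then a second SRSWOR from the units not yet chosen.

    Because every draw is made without replacement and with equal
    probability among the remaining units, the union of the two samples
    is itself an SRSWOR of size first_size + extra_size.
    """
    rng = random.Random(seed)
    units = range(1, population_size + 1)
    first = rng.sample(units, first_size)
    already_chosen = set(first)
    remaining = [u for u in units if u not in already_chosen]
    extra = rng.sample(remaining, extra_size)
    return first, extra

if __name__ == "__main__":
    first, extra = extend_srswor(9945, 100, 900, seed=2009)
    combined = first + extra
    print(len(first), len(extra), len(set(combined)))
```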

As seen in Table 1, the standard error with 100 shelves was about √10 times that of the sample with 1,000 shelves (as expected), but it still was quite low. The standard error (s.e.) of $\hat{\phi}_h$ is approximately the square root of

$$\frac{\hat{\phi}_h\,(30-\hat{\phi}_h)}{30\,m_h}\left(1-\frac{m_h}{M_{oh}}\right),$$

which is a formula from survey sampling that accounts for sampling units from a finite population. Here, m_h is the number of sampled shelves and M_oh is the total number of shelves in the stratum. Therefore, the decision was made to opt for a sample of 100 shelves from each stratum, as it meets our needs of reducing data collection time while keeping a satisfactory precision level.
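As a quick numerical check, the snippet below evaluates the standard-error expression, read here as the square root of φ̂(30 − φ̂)/(30 m) × (1 − m/M), for the two pilot samples in Table 1. The function name and argument names are hypothetical, and this reading of the formula is an assumption based on the surrounding definitions.

```python
import math

def se_rate_per_shelf(phi_hat, sampled_shelves, total_shelves, books_per_shelf=30):
    """Approximate s.e. of the estimated mis-shelving rate per shelf.

    phi_hat          estimated number of mis-shelved books per shelf
    sampled_shelves  m, number of shelves in the sample
    total_shelves    M, number of shelves in the stratum
    books_per_shelf  assumed constant shelf size (30 in the article)
    """
    variance = phi_hat * (books_per_shelf - phi_hat) / (books_per_shelf * sampled_shelves)
    fpc = 1 - sampled_shelves / total_shelves  # finite population correction
    return math.sqrt(variance * fpc)

if __name__ == "__main__":
    for m, phi_hat in [(100, 0.80), (1000, 0.71)]:
        print(m, round(se_rate_per_shelf(phi_hat, m, 9945), 4))
```

Under these assumptions the values come out near 0.088 and 0.025, against 0.0891 and 0.0250 in Table 1. The small gap for the 100-shelf sample is in line with the text calling the formula approximate, and the rounded rate of 0.80 may also contribute.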

Moreover, we decided to work out an estimate of the mis-shelving rate per shelf, rather than per book, since the latter was small. In every stratum, an estimate of the exact mis-shelving rate per shelf is the ratio of the total number of mis-shelved books arising out of the sampled shelves over the number of sampled shelves. Under the assumption that, on average, the book volume is 30 books per shelf, we also can work out an estimate of the exact mis-shelving rate per book once the rate is computed per shelf. Alternatively, using exact sizes of the sampled shelves would result in a ratio of the form: total number of mis-shelved books on the sampled shelves over total number of books on the sampled shelves. This would provide an estimate of the rate per book only in an approximate sense. The estimate would behave like a ratio estimate. We pursue this latter computation below only to check the variation in the final results on rate per book.
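The three quantities discussed above can be compared on the 900-shelf sample from the 4th floor south, using the counts reported in Table 2(a) and Table 2(b). The variable names are illustrative only.

```python
# (mis-shelved books, total books) for the six shelf positions in Table 2(b).
position_counts = [
    (58, 2456),   # row end, column bottom
    (82, 3768),   # row end, column middle
    (77, 3081),   # row end, column top
    (130, 4679),  # row middle, column bottom
    (172, 7825),  # row middle, column middle
    (114, 4967),  # row middle, column top
]
sampled_shelves = 900  # total number of shelves in Table 2(a)

misshelved = sum(m for m, _ in position_counts)
books = sum(b for _, b in position_counts)

# Rate per shelf: mis-shelved books divided by the number of sampled shelves.
rate_per_shelf = misshelved / sampled_shelves
# Rate per book under the assumption of 30 books on every shelf.
rate_per_book_assuming_30 = rate_per_shelf / 30
# Ratio-type estimate: mis-shelved books divided by books actually counted.
rate_per_book_ratio = misshelved / books

if __name__ == "__main__":
    print(f"per shelf:               {rate_per_shelf:.2f}")
    print(f"per book (30 per shelf): {rate_per_book_assuming_30:.2%}")
    print(f"per book (ratio):        {rate_per_book_ratio:.2%}")
```

The per-shelf rate (0.70) and the ratio-type rate per book (2.36%) match the totals in Table 2(a); the 30-books-per-shelf shortcut gives a slightly different per-book figure because the shelves do not hold exactly 30 books.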

Data Description

Table 2(a) and Table 2(b) show comparisons of shelves with different column and row positions on book stacks to check whether physical locations of the shelves make any difference, using the 900 shelves sampled from the 4th floor south. Column position was determined by proximity to the top or bottom. Those shelves within two levels from the top were considered column-top, and those within two levels from the bottom were considered column-bottom. Otherwise, the shelves were considered column-middle. Position was similarly defined for rows. Those rows within two of the left end or right end of the row were considered row-end and everything else was considered row-middle.

Statistical tests show there was no significant difference in terms of mis-shelving rates among shelves of different row and column positions, although column-middle seemed to have a lower mis-shelving rate than column-top and column-bottom. Therefore, a shelf of books seemed like a reasonable cluster. The shelves were used as the primary sampling units.


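Table 3: Frequency of Mis-Shelved Books by Collection

Columns 0 through 7 give the number of mis-shelved books on a shelf.

| Collection | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total |
|---|---|---|---|---|---|---|---|---|---|
| 1st Fl South | 59% | 19% | 14% | 5% | 3% | 0% | 0% | 0% | 100% |
| 2nd Fl South | 45% | 29% | 13% | 10% | 3% | 0% | 0% | 0% | 100% |
| 2nd Fl North | 46% | 27% | 11% | 7% | 5% | 3% | 0% | 1% | 100% |
| 3rd Fl North | 46% | 31% | 16% | 5% | 2% | 0% | 0% | 0% | 100% |
| 4th Fl South | 52% | 28% | 14% | 2% | 3% | 0% | 1% | 0% | 100% |
| 4th Fl North | 47% | 28% | 10% | 10% | 4% | 0% | 1% | 0% | 100% |
| Overall | 49.2% | 27.0% | 13.0% | 6.5% | 3.3% | 0.5% | 0.3% | 0.2% | 100% |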

Figure 1. Frequency of mis-shelved books by collection (number of mis-shelved books on a shelf: between 0 and 7)

Figure 2. Boxplot of phi by collection. X-axis: Collection. Y-axis: Phi = number of mis-shelved books on a shelf / total number of books on the same shelf

For each collection, we applied an SRSWOR design to select 100 shelves. For each selected shelf, the mis-shelving rate (the ratio of the number of mis-shelved books to the total number of books on the same shelf) was computed.

Analysis Phase

Table 3 and Figure 1 show that most of the shelves had fewer than two books mis-shelved across all the collections. Overall, 49.2% of the shelves had no books mis-shelved, 27% had only one book mis-shelved, and 13% had two books mis-shelved.

As we can see from Figure 1, 2nd floor north and 2nd floor south had the lowest percentages of shelves with fewer than two books mis-shelved, along with a high percentage of shelves with more than two books mis-shelved. The 1st floor south and 4th floor south had the highest percentage of shelves with fewer than two books mis-shelved.

Figure 2 shows the mis-shelving rate on single shelves by collection. The mean level of the 4th floor south collection was much lower than that of the other collections, and this collection also had the smallest spread. One can clearly see that the distributions were right skewed and the 3rd floor north collection had an extremely high outlier. On checking the data, we found that the 3rd floor north collection had one shelf with a high mis-shelving rate because three of its seven books were mis-shelved.

Table 4 shows distribution of book volume of the shelves by the six strata. It indicates there is no strong linear

Table 4: Data Summary by Collection and Book Volume per Shelf

Columns ≤10 through >40 give the book volume per shelf. A slash (/) marks a group with no sampled shelves.

| Data | Collection (stratum) | ≤10 | (10,20] | (20,30] | (30,40] | >40 | Overall |
|---|---|---|---|---|---|---|---|
| # of Shelves | 1st Floor S. | 7 | 27 | 38 | 22 | 6 | 100 |
| | 2nd Floor S. | 0 | 16 | 56 | 26 | 2 | 100 |
| | 2nd Floor N. | 0 | 9 | 43 | 46 | 2 | 100 |
| | 3rd Floor N. | 2 | 31 | 57 | 9 | 1 | 100 |
| | 4th Floor S. | 1 | 7 | 43 | 37 | 12 | 100 |
| | 4th Floor N. | 1 | 14 | 52 | 25 | 8 | 100 |
| # of Books per Shelf | 1st Floor S. | 9 | 17 | 25 | 35 | 50 | 25 |
| | 2nd Floor S. | / | 18 | 26 | 34 | 47 | 27 |
| | 2nd Floor N. | / | 18 | 26 | 35 | 44 | 30 |
| | 3rd Floor N. | 8 | 17 | 26 | 34 | 41 | 23 |
| | 4th Floor S. | 8 | 17 | 27 | 35 | 46 | 31 |
| | 4th Floor N. | 8 | 17 | 26 | 34 | 44 | 28 |
| # of Mis-shelved Books per Shelf | 1st Floor S. | 0.57 | 0.19 | 0.76 | 1.05 | 2.17 | 0.74 |
| | 2nd Floor S. | / | 0.75 | 0.91 | 1.19 | 1.50 | 0.97 |
| | 2nd Floor N. | / | 0.33 | 1.30 | 1.13 | 0.50 | 1.12 |
| | 3rd Floor N. | 1.50 | 0.68 | 0.86 | 1.11 | 3.00 | 0.86 |
| | 4th Floor S. | 0.00 | 0.29 | 0.58 | 0.89 | 1.67 | 0.80 |
| | 4th Floor N. | 0.00 | 0.14 | 0.85 | 1.32 | 2.63 | 1.00 |
| Mis-shelving Rate per Book | 1st Floor S. | 6.06% | 1.12% | 3.05% | 2.95% | 4.38% | 2.91% |
| | 2nd Floor S. | / | 4.20% | 3.50% | 3.56% | 3.19% | 3.58% |
| | 2nd Floor N. | / | 1.90% | 5.00% | 3.27% | 1.15% | 3.79% |
| | 3rd Floor N. | 18.75% | 4.06% | 3.37% | 3.27% | 7.32% | 3.68% |
| | 4th Floor S. | 0.00% | 1.71% | 2.16% | 2.58% | 3.64% | 2.57% |
| | 4th Floor N. | 0.00% | 0.82% | 3.28% | 3.83% | 5.92% | 3.56% |
| # of Shelves | | 11 | 104 | 289 | 165 | 31 | 600 |
| Avg. # of Books per Shelf | | 9 | 17 | 26 | 34 | 46 | 27 |
| Avg. # of Mis-shelved Books per Shelf | | 0.64 | 0.43 | 0.88 | 1.10 | 1.97 | 0.92 |


relationship between book volume and mis-shelving rate.

To further investigate the impact of book volume per shelf on mis-shelving rate, we used the Wald chi-square test, conducted with the SAS SURVEYFREQ procedure, to analyze the data. The shelves were divided into five groups according to their book volume, as we did in Table 4, and we had six strata. According to the chi-square test results, the association between book volume of the shelves and mis-shelving rate is not significant (p-value = 0.1750, $X^2_4 = 6.3653$, Adjusted F = 1.5833). This analysis procedure took into consideration the complex survey design with stratification and clustering.

Accounting for the Complex Sample Design: Statistical Inference

Now we look at statistical inference for mis-shelving rates. When a sample is greater than 5% of the population from which it is being selected and the sample is chosen without replacement, the finite population correction factor should be used. The central limit theorem and standard errors of the mean and of the proportion are based on the premise that the samples selected are chosen with replacement. However, in virtually all survey research, sampling is conducted without replacement from populations that are of a finite size N. In these cases, particularly when the sample size n is not small in comparison with the population size N (i.e., more than 5% of the population is sampled) so that n/N > 0.05, a finite population correction factor is used to define both the standard error of the mean and the standard error of the proportion. If n denotes the sample size and N denotes the population size under SRSWOR sampling, then f = n/N is known as the "sampling fraction." Under simple random sampling with replacement (SRSWR), the sampled units behave as independently and identically drawn units and the expression for the variance of the sample mean assumes a very simple form,

$$\frac{\sigma^2}{n}.$$

However, under SRSWOR sampling, there is an intrinsic dependence of the sampled units among themselves. In that case, the expression for the variance of the sample mean assumes the form

$$\frac{N}{N-1}\,\frac{\sigma^2}{n}\,(1-f),$$

where f is the sampling fraction. For large N and appreciable value of the sampling fraction (of the order of 5% or more), the formula simplifies to

$$\frac{\sigma^2}{n}\,(1-f).$$

The factor (1 − f) is known as the finite population correction factor (abbreviated as fpc). When f is small, we can drop the fpc and the formula reduces to

$$\frac{\sigma^2}{n},$$

same as that under SRSWR sampling. In the above, σ² refers to the population variance of the values of the study variable Y.
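For a small population, the SRSWOR variance expression can be checked by brute force: list every possible sample, compute its mean, and compare the variance of those means with N/(N − 1) × σ²/n × (1 − f). The sketch below does this for a made-up population of eight books; the toy data and function names are hypothetical, and σ² is taken as the population variance with divisor N, which is the convention under which this form of the formula holds.

```python
from itertools import combinations
from statistics import mean, pvariance

def enumerated_variance_of_mean(values, n):
    """Exact variance of the sample mean over all equally likely SRSWOR samples."""
    sample_means = [mean(sample) for sample in combinations(values, n)]
    return pvariance(sample_means)

def formula_variance_of_mean(values, n):
    """N/(N-1) * sigma^2/n * (1 - f), with sigma^2 using divisor N."""
    N = len(values)
    sigma2 = pvariance(values)
    f = n / N  # sampling fraction
    return (N / (N - 1)) * (sigma2 / n) * (1 - f)

if __name__ == "__main__":
    # Toy population: 1 = mis-shelved, 0 = correctly shelved.
    y = [0, 0, 0, 1, 0, 0, 1, 0]
    n = 3
    print(enumerated_variance_of_mean(y, n))
    print(formula_variance_of_mean(y, n))
    print(pvariance(y) / n)  # SRSWR variance, for comparison
```

With f = 3/8 the correction is substantial here, so the SRSWOR variance is visibly smaller than σ²/n.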

Table 5 summarizes statistical estimation of mis-shelved books and other parameters. M_oh is the number of shelves, or clusters, in the hth stratum. n_h is the average number of books per shelf. M_h is the number of shelves rescaled by a factor of (n_h/30). W_h is the weight of the hth stratum,

$$W_h = \frac{M_h}{\sum_h M_h}.$$

$\hat{\phi}_h$ is the estimated mis-shelving rate per shelf. $\hat{T}_h$ is the estimated total number of books = n_h M_oh.

The collection on 2nd floor north had the highest mis-shelving rate per shelf among the six collections we investigated, and the collection on 1st floor south had the lowest. The mean number of mis-shelved books per shelf was estimated to be 0.88, with an estimated s.e. of 0.1588. We have assumed the average shelf length in terms of book volume per shelf is 30. Therefore, an estimate of the number of mis-shelved books per 1,000 books would be 29, obtained from the per-shelf estimate by multiplying it by 1,000/30. Further, the estimated standard error would also have the same inflation factor. The mis-shelving rate we obtained is 2.9%.
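The overall per-shelf estimate can be reproduced, to the rounding shown, as a weighted sum of the stratum estimates using the weights and rates printed in Table 5. This is a sketch of the point estimate only; it assumes the overall figure is the weighted sum of W_h times the stratum rate, and it does not attempt to reproduce the reported standard error.

```python
# (W_h, estimated mis-shelved books per shelf) for each stratum, copied from Table 5.
strata = {
    "1st Floor S.": (0.0299, 0.74),
    "2nd Floor S.": (0.0382, 0.97),
    "2nd Floor N.": (0.0431, 1.12),
    "3rd Floor N.": (0.2096, 0.86),
    "4th Floor S.": (0.3571, 0.80),
    "4th Floor N.": (0.3090, 1.00),
}

# Stratified estimate: sum of weight times stratum estimate.
overall_per_shelf = sum(weight * rate for weight, rate in strata.values())

# Convert to mis-shelved books per 1,000 books, assuming 30 books per shelf.
per_1000_books = overall_per_shelf * 1000 / 30

if __name__ == "__main__":
    print(f"mis-shelved books per shelf: {overall_per_shelf:.2f}")
    print(f"mis-shelved books per 1,000 books: {per_1000_books:.0f}")
```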


In the collection on 2nd floor north, the low frequency of use played a different role than expected. While many of the shelves had fewer than two mis-shelved books, there were 10 shelves with four, five, or even seven mis-shelved books. These high numbers were due to books being stacked after the bookend or on top of the books, instead of being placed somewhere the librarian would notice them and reshelve them properly.

Table 5: Statistical Estimation of Mis-Shelved Books

| Collections (strata, L = 6) | $M_{oh}$ ($m_h$ = 100) | $n_h$ | $M_h$ | $W_h$ | $\hat{\phi}_h$ | s.e. of $\hat{\phi}_h$ | $\hat{T}_h$ | s.e. of $\hat{T}_h$ |
|---|---|---|---|---|---|---|---|---|
| 1st Floor S. | 1,033 | 25 | 861 | 0.0299 | 0.74 | 0.0898 | 25,825 | 1,043 |
| 2nd Floor S. | 1,221 | 27 | 1,099 | 0.0382 | 0.97 | 0.1016 | 32,967 | 749 |
| 2nd Floor N. | 1,239 | 30 | 1,239 | 0.0431 | 1.12 | 0.1088 | 37,170 | 755 |
| 3rd Floor N. | 7,869 | 23 | 6,033 | 0.2096 | 0.86 | 0.0924 | 180,987 | 5,150 |
| 4th Floor S. | 9,945 | 31 | 10,277 | 0.3571 | 0.80 | 0.0891 | 308,295 | 7,915 |
| 4th Floor N. | 9,528 | 28 | 8,893 | 0.3090 | 1.00 | 0.0993 | 266,784 | 7,453 |
| Overall | 30,835 | 28 | 28,779 | 1 | 0.88 | 0.1588 | 852,082 | 23,065 |

Discussion

Our librarians did a good job in keeping the overall mis-shelving rate low compared to other public and university libraries, though there were certain areas that needed more effort (e.g., 2nd floor north). Due to time constraints, we did not do a census to compare strategies. There are other strategies we can consider and compare to STRSRS, such as systematic sampling, since books are arranged in arrays, and STRSRS with proportional allocation or optimal allocation if we can define a cost structure for the sampling design. We also can look into time effect by inspecting the mis-shelving rate at different periods of the year.

We used stratified single-stage cluster sampling. The drawback of this approach is that our strata were placement based. The amount of gain due to stratification largely depends on the degree of homogeneity within each stratum and heterogeneity among the strata means with respect to the survey variable. If improving precision of the sampling design is the primary concern, then we can do a subject-based stratified sampling. One possible approach would be to define strata in a different way or incorporate a second-stage cluster sampling. In a second-stage cluster sampling, one would subsample books from each cluster. If the shelves held 300 books instead of 30, it would be time consuming to check all 300 books. Rather, one could take a random sample of 30 books from each shelf and examine those books. This second stage of sampling is common in many applications.

Another option with more advance information would be to combine homogeneous collections of similar mis-shelving rates into one stratum and reduce the number of strata in the sampling design.

Another approach for investigating the mis-shelved books and predicting counts of mis-shelved books is to use a Binomial-Poisson mixture model with each shelf as the specified interval, if information about more covariates is available.

There are many factors that may have an impact on the mis-shelving rate, including frequency of use, subjects, timing and flux of users, and the shape of the books. Perhaps mis-shelving is also a result of inexperienced personnel, lack of attention, fatigue by personnel, or the complexity of some book indexes.

Further Reading

Cochran, W. G. 1977. Sampling techniques. Wiley, NY.

Edwardy, J. M., and J. S. Pontius. 2001. Monitoring book reshelving in libraries using statistical sampling and control charts. Library Resources & Technical Services 45(2):90–94.

Groves, R. M., F. J. Fowler Jr., M. P. Couper, J. M. Lepkowski, E. Singer, and R. Tourangeau. 2009. Survey methodology. 2nd edition. Wiley.

Hedayat, A. S., and B. K. Sinha. 1991. Design and inference in finite population sampling. New York: Wiley Series in Probability and Mathematical Statistics.

Heeringa, S. G., B. T. West, and P. A. Berglund. 2010. Applied survey data analysis. Chapman & Hall.

Jan, S. S., A. W. John, and S. Nackil. 2009. A cost-benefit analysis of a collections inventory project: A statistical analysis of inventory data from a medium-sized academic library. The Journal of Academic Librarianship 35(4):314–323.

Kish, L. 1965. Survey sampling. Wiley.

Lohr, S. 2009. Sampling: Design and analysis. Duxbury.
