
Dare you buy a Henry Moore on eBay? Statistics can tell you what to avoid

When the rarefied world of modern art sales meets the digital age, almost anything is possible. You, too, can buy a Henry Moore on eBay. But it is risky. The old, high-commission auction-houses have rivals, but you will need statistics to guide you. Joseph Gastwirth and Wesley Johnson tell you where the fakes may be lurking.

March 2011

© 2011 The Royal Statistical Society

Henry Moore’s sculptures – huge works in bronze or carved marble – are iconic. The public know them from museums and public places. Less well known is that Moore, the pre-eminent British sculptor of the 20th century, produced many smaller-scale sculptures, drawings, etchings and lithographs, and that these frequently come up for sale. One perhaps surprising place to find them is on eBay. Guide prices there range from £250 to tens of thousands of pounds. And such eBay Henry Moores are not at all uncommon. On one day in December 2010, no fewer than five were on offer, from apparently separate sellers, and all described as original.

The internet has provided consumers with new and easy ways to purchase goods – and the commissions charged by an internet auction host are a fraction of those of the major art houses. But it has allowed less scrupulous businesses and individuals to offer poor-quality or mislabelled items. Ideally, before buying works at auction one would have experts examine them, as is done at the big auction-houses like Sotheby’s and Christie’s, but for items listed on eBay, which come from all over the world, this is impractical. One of us (JLG) has long been interested in art by Moore. After noticing on eBay a small sculpture, supposedly made by Henry Moore, that did not “look right”, he checked with friends at the Moore Foundation. They had received inquiries from buyers who had purchased works incorrectly attributed to Moore; so the question of estimating the prevalence of counterfeit art work arose.

Mother and Child II (1983) Cramer, Grant and Mitchinson (CGM) catalogue 672

From our informal correspondence it appeared that a much higher percentage of “drawings” or “small sculptures” were dubious than was the case for signed etchings and lithographs (prints). This last detail suggested a statistical approach that we could use to estimate the proportion of fake Henry Moores – or “questionable works”, in the more cautious language of the art world – that were out there.

In medical and social science applications, where even the best method of classification is not a “gold standard”, the Hui–Walter method [1] can be used to estimate the accuracy rates of clinical tests [2] and survey classifications [3]. That method requires one to study two subpopulations, with a different prevalence of the trait in each. The high-prevalence group might be individuals who have symptoms of deep vein thrombosis (DVT), while the low-prevalence group consists of individuals at risk of DVT because they recently had surgery. The method also needs at least two evaluators or tests, each classifying every item as a success or a failure. Furthermore, the evaluations should be independent of each other. In the case of screening for DVT, one test measures the level of antibodies while the other is based on a different technology [4].

Here we had our two subgroups – drawings and sculpture, where questionable works are common, versus lithographs and etchings, where they are rarer. In order to obtain a sufficient and representative sample of Moore’s work, almost all of the objects described as having been created by Henry Moore that appeared on eBay during the period from March 2005 to November 2007 (239 of them in all) were assessed. We needed not one but two independent evaluators: the first was Stephen Gabriel, an expert on Moore, and the second was one of the authors (JLG). Both have had a long-time interest in Moore’s art and have extensive libraries. A third collaborator, Dr H. Hikawa, downloaded the descriptions of each item, which typically included a digital photo, and provided the two evaluators with copies. The files were e-mailed to the first evaluator, while the second evaluator was given a printed version. To further ensure the independence of the evaluations, the two assessors did not discuss any of the art for sale during this period. Because a major objective of the study is to protect consumers against “misleading” descriptions, which suggest that an item is authentic when it is not, objects that were described as “similar” to or “related” to Moore were excluded. We analysed only objects that were claimed to be actually by Henry Moore.

The data and statistical model

The results of the study were summarized in two 2 × 2 tables reporting the matched pair classifications for the two groups of artwork (Table 1). The drawings and sculpture data, as we have said, were combined because the background information indicated that the prevalence of non-genuine items of both types was similar. Furthermore, the fractions of these two items that both evaluators thought dubious were similar. Table 1 shows, for each group, the items that both evaluators thought questionable; that Stephen Gabriel thought questionable but Gastwirth thought genuine; that Gabriel thought genuine but Gastwirth thought questionable; and that both evaluators thought genuine.

Table 1. Assessments of genuineness of Henry Moore’s art offered on eBay from March 2005 to November 2007 [5]

Prints

| Evaluator 1 | Evaluator 2: Questionable | Evaluator 2: Genuine | Total |
|---|---|---|---|
| Questionable | 6 | 10 | 16 |
| Genuine | 1 | 149 | 150 |
| Total | 7 | 159 | 166 |

Sculptures and drawings

| Evaluator 1 | Evaluator 2: Questionable | Evaluator 2: Genuine | Total |
|---|---|---|---|
| Questionable | 59 | 6 | 65 |
| Genuine | 2 | 6 | 8 |
| Total | 61 | 12 | 73 |

Mother and Child VIII (1983) CGM catalogue 678

Two things affect the numbers that appear in the tables: the actual prevalence of non-genuine objects in each of the two groups, and the accuracy of the evaluators. This is where having two independent evaluators is so vital: they provide a mutual cross-reference. Suitably statistically treated, each can provide a standard by which to judge the other. Furthermore, the evaluators’ accuracy has two parts. The first is their sensitivity. This is the probability that a non-genuine object will be classified correctly – that they will spot a fake. The second part is their specificity, which is in some ways the reverse – that they will know a genuine article when they see one. These are very far from being the same thing. An evaluator who classified every item as genuine would have a very high specificity, but a sensitivity of zero.

If one considers the classification of objects in the framework of classical statistics, where the null hypothesis is that the object is genuine and the alternative is that it is not, the Type I error equals 1 minus specificity and the Type II error is 1 minus sensitivity.
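The correspondence between accuracy rates and error rates can be made concrete with a few lines of Python. This is only an illustrative sketch: the numbers are the point estimates reported in Table 2, and the names `ESTIMATES` and `error_rates` are ours, introduced for the example.

```python
# Point estimates copied from Table 2 ("positive" = rated questionable).
ESTIMATES = {
    "evaluator 1": {"sensitivity": 0.968, "specificity": 0.941},
    "evaluator 2": {"sensitivity": 0.913, "specificity": 0.995},
}


def error_rates(sensitivity, specificity):
    """Return (Type I, Type II) error rates when the null hypothesis is 'genuine'."""
    type_i = 1.0 - specificity   # genuine item wrongly flagged as questionable
    type_ii = 1.0 - sensitivity  # non-genuine item wrongly passed as genuine
    return type_i, type_ii


for name, acc in ESTIMATES.items():
    type_i, type_ii = error_rates(acc["sensitivity"], acc["specificity"])
    print(f"{name}: Type I = {type_i:.3f}, Type II = {type_ii:.3f}")
```

On these figures the first evaluator wrongly flags about 6% of genuine items and misses about 3% of fakes, while the second wrongly flags about 0.5% of genuine items and misses about 9% of fakes.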

The Hui–Walter method takes the data of Table 1 and estimates both the probability that an object in each category is genuine and the accuracy of each evaluator. The virtue of the method is that it gives information both about the evaluators and the evaluated. It assumes that the accuracy rates of each evaluator are the same for both categories of art and that, conditional on the true status of an object, the evaluations are independent. Given that, it provides statistical estimates of the specificity and the sensitivity of each evaluator, and of the fraction of prints and the fraction of drawings that are questionable. The results, with their confidence intervals, are given in Table 2.
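For readers who want to see the mechanics, the following is a minimal sketch of how such estimates could be obtained with an EM algorithm under the two assumptions just stated. It is not the TAGS program used for the published analysis; the function names, the starting values and the fixed number of iterations are our own assumptions, and no standard errors are computed. If the sketch is right, the estimates should come out close to those reported in Table 2.

```python
# Table 1 counts keyed by (evaluator 1, evaluator 2): 1 = questionable, 0 = genuine.
TABLE1 = {
    "prints": {(1, 1): 6, (1, 0): 10, (0, 1): 1, (0, 0): 149},
    "sculptures_drawings": {(1, 1): 59, (1, 0): 6, (0, 1): 2, (0, 0): 6},
}


def posterior_fake(cell, prev, se, sp):
    """P(item is non-genuine | pair of ratings), assuming conditional independence."""
    p_fake, p_genuine = prev, 1.0 - prev
    for j, rating in enumerate(cell):
        p_fake *= se[j] if rating else (1.0 - se[j])
        p_genuine *= (1.0 - sp[j]) if rating else sp[j]
    return p_fake / (p_fake + p_genuine)


def hui_walter_em(tables, n_iter=5000):
    # Starting values are arbitrary assumptions; starting with se, sp > 0.5
    # steers EM to the solution in which "questionable" ratings indicate fakes.
    prev = {group: 0.5 for group in tables}
    se, sp = [0.8, 0.8], [0.8, 0.8]
    total_all = sum(sum(t.values()) for t in tables.values())
    for _ in range(n_iter):
        # E-step: expected number of non-genuine items in every cell.
        fake = {g: {c: n * posterior_fake(c, prev[g], se, sp) for c, n in t.items()}
                for g, t in tables.items()}
        total_fake = sum(sum(f.values()) for f in fake.values())
        # M-step: update prevalences, then each evaluator's accuracy rates.
        prev = {g: sum(fake[g].values()) / sum(tables[g].values()) for g in tables}
        for j in range(2):
            caught = sum(f for g in fake for c, f in fake[g].items() if c[j] == 1)
            cleared = sum(tables[g][c] - fake[g][c]
                          for g in tables for c in tables[g] if c[j] == 0)
            se[j] = caught / total_fake
            sp[j] = cleared / (total_all - total_fake)
    return {"prev": prev, "se": se, "sp": sp}


estimates = hui_walter_em(TABLE1)
print(estimates)
```

Because two evaluators and two groups give six free cell proportions and the model has six parameters, the fit is exactly identified, which is why no goodness-of-fit test comes with the estimates.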

Although the confidence intervals for the accuracy rates overlap, they suggest that the evaluators had similar but not identical rates of accuracy. The first evaluator, Stephen Gabriel, was more sensitive, detecting more counterfeit items, while the second, Joseph Gastwirth, had a slightly higher specificity, correctly classifying legitimate items. What is more remarkable is the estimated prevalence of dubious drawings and sculptures: 91.5% of them are questionable. Even taking the lower end of a 95% confidence interval gives 82% of them questionable. This clearly means that government agencies concerned with consumer protection are justified in informing the public of potential authenticity issues. In contrast, only 4.1% of the signed prints appear to be of doubtful authenticity. The obvious first lesson is: if you are thinking of buying a Henry Moore on eBay, buy a print rather than a drawing or small sculpture.
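One way to check that the reported estimates hang together is to plug them back into the model and see whether they reproduce the counts in Table 1. The sketch below does this under the conditional independence assumption; the constants are the point estimates from Table 2, and the names `SE`, `SP`, `GROUPS` and `expected_counts` are ours.

```python
# Point estimates from Table 2; "positive" means rated questionable.
SE = (0.968, 0.913)  # sensitivities of evaluators 1 and 2
SP = (0.941, 0.995)  # specificities of evaluators 1 and 2
# group name -> (estimated prevalence of questionable items, number of items)
GROUPS = {"prints": (0.041, 166), "sculptures_drawings": (0.915, 73)}


def expected_counts(prev, n_items):
    """Expected cell counts keyed by (evaluator 1, evaluator 2); 1 = questionable."""
    counts = {}
    for r1 in (1, 0):
        for r2 in (1, 0):
            # Contribution of non-genuine items to this cell.
            p_fake = prev * (SE[0] if r1 else 1 - SE[0]) * (SE[1] if r2 else 1 - SE[1])
            # Contribution of genuine items to this cell.
            p_genuine = ((1 - prev)
                         * ((1 - SP[0]) if r1 else SP[0])
                         * ((1 - SP[1]) if r2 else SP[1]))
            counts[(r1, r2)] = n_items * (p_fake + p_genuine)
    return counts


for name, (prev, n_items) in GROUPS.items():
    rounded = {cell: round(c, 1) for cell, c in expected_counts(prev, n_items).items()}
    print(name, rounded)
```

The expected counts land within a fraction of an item of every observed cell, for example roughly 6.1 against the 6 prints that both evaluators rated questionable.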

While a number of authors have raised questions about the validity of the estimates from latent class models such as the Hui–Walter model [7], most of the studies indicate that it is the estimates of sensitivity and specificity that are most affected by modest violations of its assumptions; the estimates of prevalence are more sturdy. Furthermore, the greater the difference in the prevalence of the characteristic in the two groups, the greater is the robustness of the prevalence estimate [8] – and here our difference is indeed great: between fake rates of 91.5% in drawings and 4.1% in prints lies a difference of 87.4 percentage points. We may therefore place some reliance on our conclusions. The key assumption is that each evaluator has the same sensitivity and specificity for artworks of both types, and that the two evaluations are independent given an object’s true status.

Although we took pains to ensure that the evaluators worked independently, there are two ways in which a modest degree of dependence could arise. Some sellers may offer multiple objects and, whether by design or ignorance, there is likely to be correlation in the status of the items put on eBay by the same seller. Also, both evaluators probably consulted many of the same definitive catalogues and books and might have compared the photograph on eBay with the same “reference photo”. To check the potential sensitivity of the results to possible dependence, a model allowing for such correlation was also fitted to the data [9]. The estimated correlation was 0.29, which is insufficient to result in a serious bias in the prevalence estimates.

Implications for buyers of artwork

Clearly the results indicate that consumers should not take for granted the authenticity of works by Moore, and probably by other major artists, that are offered on eBay or by other internet sellers, and that they should carefully compare the digital photographs and related information provided by sellers with the corresponding information in the major catalogues. This also applies to Moore’s prints, because several that were classified as non-genuine were unsigned versions to which a questionable signature had been added. As in all observational studies, there is a possibility that some important covariates, such as provenance or prior ownership of the item, were not available. It is difficult to think, however, of a realistic covariate that could explain the very low prevalence of genuine drawings and small sculptures. The very high proportion of dubious drawings and small sculptures by Moore offered on eBay indicates that prospective buyers of art by other major artists, such as Picasso or Chagall, should also be very careful.

Potential applications in legal cases

After we began the project we became aware of several legal decisions in cases where eBay was sued for assisting the sale of counterfeit products. All the suits involved possible violations of intellectual property rights and trademark infringement, but the legal criteria used in different nations are not uniform. Moreover, eBay did have a process that allowed firms to report counterfeit items. Statistical evidence had a key role in many of the cases.

In the United States, eBay was found not to have contributed to trademark infringement in Tiffany v. eBay [10]. Tiffany presented a survey which claimed that about 75% of the items labelled as its product were counterfeit, while only 5% were surely genuine [11, 12]. The courts decided that, even though eBay had general knowledge that counterfeit Tiffany silver jewellery was being sold, it was only required to take action if it had contemporary knowledge of which particular listings were infringing or would infringe in the future. Furthermore, the trial court noted significant flaws in Tiffany’s survey. It was not probability-based, so one could not calculate a confidence interval for the fraction of non-genuine items. Furthermore, the search used to identify the items that were purchased and examined by two Tiffany experts included non-silver jewellery as well as the silver items that were at issue in the case. The sample sizes (186 in 2004 and 139 in 2005) were less than those specified by the survey designer. One reason for this shortfall was that Tiffany was unable to purchase some of the items that were supposed to be in their sample. It was quite likely that those “missing items” had a higher probability of being genuine than those that they were able to acquire, as knowledgeable individual buyers were also bidding for the genuine pieces but not for the fakes. Finally, Tiffany did not participate in eBay’s monitoring programme during this time, so that items that could have been removed from the site were not. Although eBay’s statistical expert agreed that a substantial amount (at least 30%) of Tiffany jewellery was counterfeit, this only helped establish that eBay had general knowledge that counterfeit products were being sold on its site.

Table 2. Maximum likelihood estimates of the two prevalence parameters and accuracy rates of the two evaluators. Maximum likelihood estimates were obtained using the EM algorithm with standard errors based on the bootstrap using the program TAGS [6]

| Parameter | Mean | 95% confidence interval |
|---|---|---|
| Se1, sensitivity of evaluator 1 (Stephen Gabriel) | 0.968 | (0.877, 0.992) |
| Se2, sensitivity of evaluator 2 (Joseph Gastwirth) | 0.913 | (0.810, 0.962) |
| Sp1, specificity of evaluator 1 | 0.941 | (0.889, 0.969) |
| Sp2, specificity of evaluator 2 | 0.995 | (0.939, 0.999) |
| Prev1, fraction of prints that are dubious | 0.041 | (0.018, 0.089) |
| Prev2, fraction of sculptures and drawings that are dubious | 0.915 | (0.818, 0.962) |

In France, however, a study that was submitted by Christian Dior and Louis Vuitton estimated that 90% of items allegedly made by these designers were not genuine. This study was accepted by the court. This estimate is surprisingly similar to the 91.5% prevalence estimate for non-genuine Henry Moore drawings and small sculptures in our study. Partly on this basis, eBay was found liable for contributing to trademark infringement.

Although surveys have been used to estimate the proportion of potential consumers who are “confused” – a polite word for “deceived” – as to the source of a product because of the design or packaging or are misled by advertising, statisticians may not fully appreciate the potential for using statistical surveys and studies similar to ours in trademark infringement cases. The method we have used here might be adapted to help monitor the authenticity of items offered for sale on the internet.

Potential refinements and improvements to the study design

Our work can be regarded as a proof of principle: it is possible to obtain reasonable estimates of the prevalence of counterfeit items even when the evaluators do not examine the pieces individually. During the time the data were collected, the evaluators observed that some particular art objects came up repeatedly, and that items from some particular sellers, especially those who sold many items, were more likely not to be authentic. The approach could be improved by incorporating knowledge that is gained during a first phase, either about the type of items that are non-genuine or sellers of those products, into a second phase study. That study might be a probability-based buying programme that is focused on a smaller group of likely sellers of problematic objects.

Two Women Seated on Beach (1984) CGM catalogue 719

When it is possible to obtain a third independent evaluation, the latent class approach does not require two subpopulations and has been successfully used to evaluate screening tests and estimate the prevalence of disease in animals. The three-evaluator version is well suited to estimating the prevalence of counterfeit jewellery, as a second subpopulation with a low prevalence of fakes might not exist.

One possible limitation of the method is that an infringing seller might purchase one expensive handbag, say, make counterfeit versions, but put a picture of the genuine bag on the internet. Presumably, a disappointed purchaser would complain to eBay, which would inform the company about a particular seller of infringing items. Trademark holders and consumer protection agencies might still find a broad-based study or survey that provided a statistically reliable estimate of the fraction of counterfeit products sold by internet sites useful both in legal cases and to inform policy-makers and the public of the magnitude of the problem.

References

1. Hui, S. L. and Walter, S. D. (1980) Estimating the error rates of diagnostic tests. Biometrics, 36, 167–171.

2. Pepe, M. and Janes, H. (2007) Insights into latent class analysis of diagnostic test performance. Biostatistics, 8, 474–484.

3. Sinclair, M. D. and Gastwirth, J. L. (1996) On procedures for evaluating the effectiveness of reinterview survey methods: application to labor force data. Journal of the American Statistical Association, 91, 961–969.

4. Line, B. R., Peters, T. L. and Keenan, J. (1997) Diagnostic test comparisons in patients with Deep Venous Thrombosis. Journal of Nuclear Medicine, 38, 89–92.

5. Gastwirth, J. L., Johnson, W. O. and Hikawa, H. (2011) Estimating the fraction of “non-genuine” artwork by Henry Moore on eBay: application of latent class screening test methodology. Journal of the Royal Statistical Society, Series A, 174 (in press).

6. Pouillot, R., Gerbier, G. and Gardner, I. A. (2002) “TAGS”, a program for the evaluation of test accuracy in the absence of a gold standard. Preventive Veterinary Medicine, 53, 67–71.

7. Spencer, B. (2010) When do latent class models overstate accuracy for binary classifiers? (in press).

8. Sinclair, M. D. and Gastwirth, J. L. (2000) Properties of the Hui and Walter and related methods for estimating prevalence rates and error rates of diagnostic testing procedures. Drug Information Journal, 34, 605–615.

9. Dendukuri, N. and Joseph, L. (2001) Bayesian approaches to modeling the conditional dependence between multiple diagnostic tests. Biometrics, 57, 158–167.

10. 576 F. Supp. 2d 463 (S.D.N.Y. 2008) and 600 F. 3d 93 (2d. Cir. 2010).

11. Goldwasser, K. (2010) Knock it off: An analysis of trademark counterfeit goods regulation in the United States, France and Belgium. Cardozo Journal of International and Comparative Law, 18, 207–238.

12. Levin, E. K. (2009) A safe harbor for trademark: Reevaluating secondary trademark liability after Tiffany v. eBay. Berkeley Technology Law Journal, 24, 491–527.

Joseph Gastwirth is Professor of Statistics and Economics at the George Washington University, Washington, DC, and Wesley Johnson is Professor of Statistics at the University of California at Irvine.

Acknowledgements

Grateful thanks are due to the Henry Moore Foundation (www.henry-moore.org) for their generosity in providing digital images of the artwork. A fuller, more technical version is to appear in the Journal of the Royal Statistical Society, Series A.

Two Reclining Figures in Yellow and Green (1967) CGM catalogue 74