Simple Visualizations of Paired Comparisons

Spencer Graves, PDF Solutions, San Jose, CA

Hans-Peter Piepho, University of Hohenheim, Germany


Two Methods

Convert a table of logicals (TRUE or FALSE) indicating which pairs of k objects are / are not distinct (e.g., a p-value or distance exceeding a threshold, or a correlation less than a threshold) into an easily decoded visual.

Two alternatives:
  Letter-based representation (Piepho 2004, JCGS 13: 456-466)
  “Neighbor” or “Indistinct Ts” (similar to “undifferentiated classes”; Donoghue 2004*)

R package “multcompView” (www.r-project.org)

* John R. Donoghue (2004) “Implementing Shaffer’s multiple comparison procedure for a large number of groups”, pp. 1-23 in Benjamini, Bretz and Sarkar (eds) Recent Developments in Multiple Comparison Procedures (Institute of Mathematical Statistics Lecture Notes-Monograph Series vol. 47)
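As a rough illustration of this conversion, the sketch below passes a small table of logicals to the two multcompView functions. The objects A, B, C and their TRUE/FALSE values are made up for illustration. The sketch assumes the package is installed and that both functions accept a logical vector whose names join the two labels of each pair with a hyphen.

```r
# Assumes multcompView is installed: install.packages("multcompView")
library(multcompView)

# Hypothetical table of logicals for k = 3 objects, one entry per pair.
# TRUE means the pair is distinct; names are the two labels joined by "-".
distinct <- c("A-B" = FALSE, "A-C" = FALSE, "B-C" = TRUE)

# Letter-based representation: objects that share a letter are NOT distinct.
lett <- multcompLetters(distinct)
print(lett)

# "Ts" representation: one row per object, one column per "T".
ts <- multcompTs(distinct)
print(ts)
```

If the input holds p-values rather than logicals, both functions compare them with a threshold (0.05 by default) before building the display.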

multcompTs

[Figure: boxplots of recovery time (axis 6 to 16) for blankets b0, b1, b2, b3, with the paired comparisons displayed as Ts and as rectangles & triangles; display columns b3.b0, b2, b1]

Each “level of a factor” (boxplot) is associated with a column of the display, which indicates whether each other level is or is not distinct.

Example: b0 is distinct from b2 but not from b1 or b3. b3 has the same “undifferentiated pattern” as b0.

Each “T” or triangle points to the level(s) that column represents.

Data: recovery in library(multcomp), minutes ~ blanket. Post-surgery recovery time (min.) with different heating blankets (b1, b2, b3) vs. a standard blanket (b0).
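The idea of an “undifferentiated pattern” can be sketched in base R without the package. This is not the multcompTs implementation, only an illustration of the grouping. The distinct pairs are those stated on the slides (b0 vs. b2 and b3 vs. b2), and the variable names are hypothetical.

```r
lev <- c("b0", "b1", "b2", "b3")

# Pairs read off the slides as distinct: b0 vs. b2 and b3 vs. b2.
distinct <- matrix(FALSE, 4, 4, dimnames = list(lev, lev))
distinct["b0", "b2"] <- distinct["b2", "b0"] <- TRUE
distinct["b3", "b2"] <- distinct["b2", "b3"] <- TRUE

# Pattern of each level: the levels it is NOT distinct from.
pattern <- apply(!distinct, 1, function(r) paste(lev[r], collapse = " "))

# Levels with identical patterns share one column of the display.
columns <- split(lev, factor(pattern, levels = unique(pattern)))
names(columns) <- unname(sapply(columns, paste, collapse = "."))
print(columns)
```

Here b0 and b3 fall into one group, matching the shared column in the display; the order of labels within a column name may differ from the plot.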

multcompLetters

[Figure: the same boxplots of recovery time (axis 6 to 16) for b0, b1, b2, b3, with the Ts, the “Letters” displayed as rectangles, and the Letters (a, b)]

Rows that share “Letters” are NOT distinct.

Example: b1 is not distinct from any of the other levels, because it shares “a” with b0 and b3, and it shares “b” with b2.

b0 is not distinct from b1 or b3 but is distinct from b2.
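The decoding rule can be written out in a few lines of base R. This is a sketch of how a reader applies the rule, not code from the package. The letter assignments are those implied by the example above (b0 and b3 carry “a”, b2 carries “b”, b1 carries both), and the names `lett`, `sets` and `not_distinct` are hypothetical.

```r
# Letters implied by the example: b0 "a", b1 "ab", b2 "b", b3 "a".
lett <- c(b0 = "a", b1 = "ab", b2 = "b", b3 = "a")

# Split each label into its individual letters.
sets <- strsplit(lett, "")

# Two levels are NOT distinct when their letter sets intersect.
not_distinct <- sapply(sets, function(x)
  sapply(sets, function(y) length(intersect(x, y)) > 0))
print(not_distinct)
```

The resulting matrix is TRUE wherever two levels share a letter, so the row for b1 is TRUE throughout, while the b0/b2 and b3/b2 entries are FALSE.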

recovery: minutes ~ blanket

[Figure: boxplots for b0, b1, b2, b3 (axis 6 to 16) with Ts, “Letters” displayed as rectangles, and Letters, alongside the dataPOWER display of the same comparisons]

dataPOWER matches Piepho’s Letters (and JMP) in this case.

Ts vs. Letters

[Figure: boxplots for b0, b1, b2, b3 (axis 6 to 16) with Ts and Letters]

Ts
  Pro: Simple, easily decoded visually
  Con: Can’t add to a simple text table without graphics

Letters
  Pro: Easily added to a text table; less space than “Ts”?
  Con: Requires more cognitive processing

Cholesterol Study: 5 different treatments

[Figure: boxplots of Reduction in Cholesterol (axis 5 to 25) for 1time, 2times, 4times, drugD, drugE, with Ts and Letters (a, ab, bc, c, d)]

1time = 20 mg once/day
2times = 10 mg twice/day
4times = 5 mg 4 times/day
drugD, drugE = 2 other drugs

drugE produced the greatest reduction, and is the only treatment distinct from all others.

For “2times” and “4times”, we must read 2 different “letters” but only one “T” each.

See the “multcomp” package in R or Peter H. Westfall, Randall D. Tobias, Dror Rom, Russell D. Wolfinger, and Yosef Hochberg (1999) Multiple Comparisons and Multiple Tests (SAS Institute).

cholesterol: response ~ trt in dataPOWER

[Figure: the cholesterol boxplots (axis 5 to 25) with Ts and Letters, alongside the dataPOWER display of the same comparisons]

dataPOWER matches Piepho’s Letters (and JMP) in this case.

R Foundation for Statistical Computing (www.r-project.org)

R is the platform of choice for an increasing number of the leading experts in statistical computing.
  723 contributed packages downloadable from ‘CRAN’ (2006.04.30)
  58 mirrors in 24 countries (2006.04.30)

The availability of downloadable R code substantially reduces the time to learn, apply, modify and extend existing statistical techniques.

You can increase your chances that people like me will read and cite your work if you publish in journals with articles, data and R scripts freely downloadable, or books with companion R packages.

multcompView package in R

Data, Paired Comparison, Summary

  function: multcompTs, multcompLetters, multcompBoxplot
  object of class: multcompTs, multcompLetters
  methods: Plot, Print

[Figure: boxplots for b0, b1, b2, b3 (axis 6 to 16) with Ts and Letters]

Summary: R package “multcompView”

Two visual summaries of paired comparisons relative to a threshold:
  multcompTs: Easily decoded visual
  multcompLetters: Parsimonious, letter-based summary that does not require graphics

multcompBoxplot: General function for producing variations of either or both with boxplots.
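A minimal sketch of how the three functions might be used together, assuming multcompView is installed. It uses the warpbreaks data that ships with R and Tukey-adjusted p-values from TukeyHSD. The choice of data and of TukeyHSD as the comparison method is an assumption for illustration, not the only possible workflow.

```r
# Assumes multcompView is installed: install.packages("multcompView")
library(multcompView)

# Paired comparisons: Tukey HSD on a one-way ANOVA (warpbreaks ships with R).
fit <- aov(breaks ~ tension, data = warpbreaks)
tuk <- TukeyHSD(fit)

# Adjusted p-values, named by pair: "M-L", "H-L", "H-M".
p <- tuk$tension[, "p adj"]

# Summaries relative to the default threshold of 0.05.
lett <- multcompLetters(p)   # object of class "multcompLetters"
ts   <- multcompTs(p)        # object of class "multcompTs"
print(lett)                  # letters can go straight into a text table
plot(ts)                     # the Ts need a graphics device

# Boxplots with the summaries drawn alongside.
multcompBoxplot(breaks ~ tension, data = warpbreaks)
```

The printed letters need no graphics, which is the point of the letter-based summary; the Ts and the boxplot display open a plot.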


Appendix

anorexia: postwd ~ Treat

[Figure: boxplots (axis 75 to 100) for CBT, Cont, FT with Ts and Letters (a, b), alongside the dataPOWER display of the same comparisons]

dataPOWER matches Piepho’s Letters (and JMP) in this case.

warpbreaks: breaks ~ tension

[Figure: boxplots (axis 10 to 70) for tension levels L, M, H with Ts and Letters (a, b), alongside the dataPOWER display of the same comparisons]

dataPOWER matches Piepho’s Letters (and JMP) in this case.

[Figure: normal probability plot of Number of Defects on the original scale (0 to 150); horizontal axis: Sample Quantiles; vertical axes: Normal Probabilities (5e-04 to 0.9995) and Quantiles of the Standard Normal (-4 to 4)]

Number of defects of a certain type

Defects = counts ranging from 0 to 170.

Normal plot on a log scale after Defects==0 were replaced by 0.1. The value 0.1 was chosen to place Defects=0 roughly on the line on a normal probability plot.

We use multcompBoxplot, ignoring the obvious violations of assumptions involved in applying a normal-theory ANOVA to log(counts).

Before we used the results, we’d want to repeat the analysis using software for generalized linear mixed models, which are more appropriate to these data.

[Figure: normal probability plot of Number of Defects on a log scale (0.1 to 50); horizontal axis: Sample Quantiles; vertical axes: Normal Probabilities (5e-04 to 0.9995) and Quantiles of the Standard Normal (-4 to 4)]
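The zero replacement and the log-scale normal plot described for the defect counts can be sketched in base R. The counts below are simulated purely for illustration (they are not the data from the talk), and the distribution and parameters used to simulate them are arbitrary assumptions.

```r
set.seed(1)

# Hypothetical counts for illustration only; not the data from the talk.
defects <- rnbinom(200, size = 0.5, mu = 10)

# Replace zeros by 0.1 so that the logarithm is defined.
adj <- ifelse(defects == 0, 0.1, defects)
logDefects <- log(adj)

# Normal probability plot of the log counts, with a reference line.
qqnorm(logDefects)
qqline(logDefects)
```

With real data, the replacement value would be tuned by eye, as on the slide, so that the zeros sit roughly on the reference line.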

[Figure: boxplots of log(Defects) (axis -2 to 4) for operators o1 to o24, with Ts and Letters (a to e)]

log(Defects) ~ operator: 24 operators

From the “Ts”, it appears that 2 are better than 6 others: o1 and o5 are better than o15, o10, o2, o17, o6, and o14.

Moreover, 2 operators seem WORSE than all others: o6 and o14.

[Figure: the same boxplots of log(Defects) (axis -2 to 4) for operators o1 to o24, with Ts and Letters (a to e)]

log(Defects) ~ operator

From the Letters:
  It’s clear that o6 and o14 are worse than all the others.
  It’s NOT obvious that o1 and o5 are the only operators significantly better than 6 others.

Conclusion:
  Letters are more concise: 5 columns vs. 9 in this example.
  Ts are easier to read.

log(Defects) ~ operator

[Figure: boxplots of log(Defects) (axis -2 to 4) for operators o1 to o24, with the comparisons drawn as boxes and triangles]

The same display as boxes (and triangles for the bases of Ts).
  Same information
  More easily read: it seems easier to see that o15 is different from o1 and o5.

log(Defects) ~ operator (dP sort)

dataPOWER does not match: o02 and o08 are NOT different according to Ts & Letters, but ARE different per dataPOWER.

[Figure: dataPOWER display of log(Defects) (axis -2 to 4) for operators o01 to o24]

o02 and o08 do NOT both have boxes in the same column per dataPOWER, but DO in “Letters”.

log(Defects) ~ operator

dataPOWER does not match: o02 and o08 are NOT different according to Ts & Letters, but ARE different per dataPOWER.

[Figure: dataPOWER display of log(Defects) (axis -2 to 4) for operators o01 to o24]

Same for o08 with o05, o07, o15, o16, o21, o22, and o23.

JMP matches Piepho’s Letters (and NOT dP).

Same example in JMP

[Figure: JMP display of log(Defects) (axis -2 to 4) by operator]

Same example in JMP

[Figure: JMP display of log(Defects) (axis -2 to 4) by operator]

Conclusion: JMP matches dP. The “operator” codes are not the same, because I changed the codes & didn’t regenerate the analysis.

What next?

Short term:
  Ignore the discrepancy in the display in the more complicated example.
  Fix the documentation so someone can make sense of it.

Intermediate term:
  Fix the dP TukeyHSD code.

Long term:
  Fix the Tukey display to match Piepho’s letters.
  Some customers know how to read those things, so we shouldn’t just change it.

Even longer term:
  Consider adding the “Ts” in some version.
  It may be easier to read, but customers want other things more than this.