
Putting lots of things in order: R-VALUES
Source: pages.stat.wisc.edu/~newton/talks/cancun.pdf


Examples

[Figure: Sports statistics, 2013-14 NBA. 461 players, 58,029 free throw attempts (espn.go.com). x-axis: # free throw attempts (log scale, 1 to 1000); y-axis: free throw percentage (0.0 to 1.0).]

[Figure: Genome-wide association studies. x-axis: standard error (0.01 to 0.10); y-axis: log odds (0.05 to 0.35).]

[Figure: Gene set analysis. x-axis: set size N (10 to 1000); y-axis: proportion of set detected by RNAi (0.05 to 0.50).]

Intuitively, the r-value for unit i is the smallest α such that when ranking units by the posterior probability of being in the top α fraction of the system, unit i remains in the top α fraction of the ranked list.

• setting: large scale, non-sparse inference
• examples: genomics/sports/evaluations/...
• task: rank order the units
• challenge: differential uncertainty

• lots of units: $\{i\}$
• data: $\{D_i\}$
• parameters (of interest): $\{\theta_i\}$
• model: $p(D_i \mid \theta_i)$

Putting lots of things in order: R-VALUES

[Figure: density of free throw ability; x-axis 0.2 to 1.0.]

[Figure: posterior exceedance probability $P(\theta_i \ge \theta_\alpha \mid D_i)$ against $\alpha$ (log scale, 0.002 to 1.000) for two examples, $D_{\text{Ray.Allen}} = 105/116$ and $D_{\text{LeBron.James}} = 439/585$, shown with the empirical quantile $\hat\lambda_\alpha$ and the resulting r-value.]

Improved ranking and selection 25

Table 2. Leading free-throw shooters, 2013-2014 regular season of the National Basketball Association. From 461 players who attempted at least one free throw, shown are the top 25 players as inferred by r-value. Data $D_i$ on player $i$ include the number of made free throws $y_i$ and the number of attempts $n_i$. Other columns indicate free-throw percentage FTP $= y_i/n_i$, which is the maximum likelihood estimate (MLE) of the underlying ability $\theta_i$; posterior mean PM $= E(\theta_i \mid D_i)$; r-value RV $= \inf\{\alpha : P(\theta_i \ge \theta_\alpha \mid D_i) \ge \lambda_\alpha\}$; qualified rank, Q.R, which is the rank of FTP amongst players for whom $y_i \ge 125$; and ranks associated with the MLE, posterior mean, and r-value (MLE.R, PM.R, RV.R). A dash marks players without a qualified rank.

player i           y_i   n_i    FTP     PM     RV   Q.R  MLE.R  PM.R  RV.R
Brian Roberts      125   133  0.940  0.913  0.002     1     17     1     1
Ryan Anderson       59    62  0.952  0.898  0.003     -     15     2     2
Danny Granger       63    67  0.940  0.893  0.005     -     16     3     3
Kyle Korver         87    94  0.926  0.892  0.008     -     19     4     4
Mike Harris         26    27  0.963  0.866  0.010     -     14    15     5
J.J. Redick         97   106  0.915  0.886  0.011     -     22     6     6
Ray Allen          105   116  0.905  0.880  0.016     -     25     8     7
Mike Muscala        14    14  1.000  0.844  0.017     -      7    34     8
Dirk Nowitzki      338   376  0.899  0.891  0.018     2     30     5     9
Trey Burke         102   113  0.903  0.877  0.018     -     28     9    10
Reggie Jackson     158   177  0.893  0.877  0.024     3     32    11    11
Kevin Martin       303   340  0.891  0.882  0.025     4     33     7    12
Gary Neal           94   105  0.895  0.869  0.025     -     31    14    13
D.J. Augustin      201   227  0.885  0.873  0.031     5     38    12    14
Stephen Curry      308   348  0.885  0.877  0.031     6     39    10    15
Patty Mills         73    82  0.890  0.860  0.032     -     34    19    16
Courtney Lee        99   112  0.884  0.861  0.035     -     40    18    17
Steve Nash          22    24  0.917  0.834  0.039     -   20.5    44    18
Greivis Vasquez     95   108  0.880  0.857  0.040     -     41    22    19
Robbie Hummel       15    16  0.938  0.825  0.043     -     18    55    20
Mo Williams         78    89  0.876  0.850  0.046     -     42    24    21
Kevin Durant       703   805  0.873  0.870  0.048     7     45    13    22
Aaron Brooks        83    95  0.874  0.850  0.049     -     44    26    23
Damian Lillard     371   426  0.871  0.865  0.050     8     47    16    24
Nando de Colo       31    35  0.886  0.831  0.057     -     37    48    25
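The gap between the FTP (MLE) and posterior-mean columns of Table 2 can be illustrated with a short sketch. Only the (made, attempted) pairs come from the table; the Beta(a, b) hyperparameters `A_PRIOR` and `B_PRIOR` are hypothetical placeholders, not the values the authors fitted, so the printed posterior means are not expected to reproduce the PM column exactly.

```python
# Hypothetical Beta(a, b) prior for illustration; NOT the authors' fitted values.
A_PRIOR, B_PRIOR = 15.0, 5.0

# (made, attempted) pairs taken from Table 2.
players = {
    "Brian Roberts": (125, 133),
    "Mike Muscala": (14, 14),
    "Kevin Durant": (703, 805),
}

def mle(y, n):
    """Free-throw percentage y/n, the maximum likelihood estimate."""
    return y / n

def posterior_mean(y, n, a=A_PRIOR, b=B_PRIOR):
    """Mean of the Beta(a + y, b + n - y) posterior under a binomial likelihood."""
    return (a + y) / (a + b + n)

for name, (y, n) in players.items():
    print(f"{name:15s} FTP={mle(y, n):.3f}  PM={posterior_mean(y, n):.3f}")
```

Under any such prior, a perfect record on few attempts (14 of 14) is pulled toward the prior mean much more strongly than a long record (125 of 133), which is why the MLE ranking and the posterior-mean ranking disagree.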

Construction of ranking variables, NBA example

[Figure: density of free throw ability (x-axis 0.2 to 1.0), with the quantile $\theta_\alpha$ marked.]

[Figure: $E[\,\mathrm{similarity}_t\{\mathrm{Ranks}(\theta), \widehat{\mathrm{Ranks}}[\text{midseason}]\} \mid \text{complete season}\,]$ against t = rank from top (t = 5 to 25), for three methods: r-value, posterior mean, MLE.]

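1. local posteriors: $p(\theta_i \mid D_i)$ (binomial likelihood, beta prior, beta posteriors)
2. estimated marginal: $p(\theta_i)$ (beta, in the parametric case)
3. posterior exceedance probability: $P(\theta_i \ge \theta_\alpha \mid D_i)$, with threshold $\lambda_\alpha$ satisfying $P\{P(\theta_i \ge \theta_\alpha \mid D_i) \ge \lambda_\alpha\} = \alpha$
4. r-value: $r(D_i) = \inf\{\alpha : P(\theta_i \ge \theta_\alpha \mid D_i) \ge \lambda_\alpha\}$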
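The four construction steps can be sketched numerically on a grid. This is a minimal illustration under stated assumptions, not the implementation in the rvalues package: the function name `grid_rvalues`, the discretized flat prior, and the four-unit data set are all hypothetical, and the prior is fixed here rather than estimated from the data as the poster describes. The threshold $\lambda_\alpha$ is replaced by an empirical quantile of the exceedance probabilities across units.

```python
import math

def grid_rvalues(data, grid, prior, alphas):
    """Approximate r-values on a finite grid.

    data   : list of (y, n) binomial counts, one pair per unit
    grid   : increasing theta values strictly inside (0, 1)
    prior  : prior mass at each grid point (sums to 1)
    alphas : increasing candidate alpha values, ending at 1.0
    """
    n_units = len(data)

    # Step 1: local posteriors p(theta_i | D_i) on the grid.
    posteriors = []
    for y, n in data:
        logw = [math.log(p) + y * math.log(t) + (n - y) * math.log(1.0 - t)
                for t, p in zip(grid, prior)]
        top = max(logw)
        w = [math.exp(v - top) for v in logw]
        total = sum(w)
        posteriors.append([v / total for v in w])

    # Upper-tail prior mass at each grid index: P(theta >= grid[k]).
    tail = [0.0] * len(grid)
    running = 0.0
    for k in range(len(grid) - 1, -1, -1):
        running += prior[k]
        tail[k] = running

    rvals = [None] * n_units
    for alpha in alphas:
        # Step 2: theta_alpha, the upper-alpha quantile of the prior.
        k_alpha = 0
        for k in range(len(grid) - 1, -1, -1):
            if tail[k] >= alpha - 1e-12:
                k_alpha = k
                break
        # Step 3: exceedance probabilities P(theta_i >= theta_alpha | D_i).
        v = [sum(post[k_alpha:]) for post in posteriors]
        # Empirical threshold: the ceil(alpha * N)-th largest exceedance probability.
        m = max(1, math.ceil(alpha * n_units - 1e-12))
        lam = sorted(v, reverse=True)[m - 1]
        # Step 4: r-value = smallest alpha at which the unit clears the threshold.
        for i in range(n_units):
            if rvals[i] is None and v[i] >= lam:
                rvals[i] = alpha
    return rvals

grid = [(k + 0.5) / 100 for k in range(100)]
prior = [1.0 / 100] * 100                    # hypothetical flat prior
alphas = [k / 100 for k in range(1, 101)]
data = [(125, 133), (14, 14), (439, 585), (5, 20)]  # illustrative units
rv = grid_rvalues(data, grid, prior, alphas)
print(rv)
```

Smaller r-values correspond to higher rank. With only four units and a flat prior the numbers are not comparable to Table 2; the sketch is meant to show the order of operations only.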

r-values rank players better from mid-season data, as validated on complete season data.

r = 0.016

R-value ranking maximizes agreement between the true top α fraction and the reported top α fraction, for all α.
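As an illustration of what "agreement" measures, the sketch below computes the fraction of units that fall in both the true top α set and the reported top α set. The function names and the toy data are hypothetical; scores are treated as "larger is better", so r-values would need to be negated before being passed in.

```python
def top_alpha_set(scores, alpha):
    """Indices of the top alpha fraction of units (larger score = better)."""
    m = max(1, round(alpha * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:m])

def agreement(true_theta, reported_score, alpha):
    """Fraction of all units in both the true and the reported top-alpha sets."""
    both = top_alpha_set(true_theta, alpha) & top_alpha_set(reported_score, alpha)
    return len(both) / len(true_theta)

# Toy example with 10 units: the reported scores misplace the best unit.
true_theta = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
reported = [0.50, 0.95, 0.90, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05, 0.00]
print(agreement(true_theta, reported, alpha=0.2))
```

The agreement can never exceed α, which is attained only when the reported top set coincides with the true one.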

R package: rvalues
http://www.stat.wisc.edu/~newton/

Henderson, N.C. and Newton, M.A. (2013). Making the cut: improved ranking and selection for large-scale inference. arXiv:1312.5776

M.A. Newton and N.C. Henderson
University of Wisconsin, Madison, USA

Measurement model:

$D_i = (X_i, \sigma_i^2)$
$E(X_i \mid \theta_i, \sigma_i^2) = \theta_i$
$\mathrm{var}(X_i \mid \theta_i, \sigma_i^2) = \sigma_i^2$
$V_\alpha(x, \sigma^2) = P(\theta_i \ge \theta_\alpha \mid X_i = x, \sigma_i^2 = \sigma^2)$

Assume $V_\alpha(x, \sigma^2)$ is right-continuous and non-decreasing in $x$ for every $\alpha$ and $\sigma^2$.

Theorem: $P\{r(D_i) \le \alpha,\ \theta_i \ge \theta_\alpha\} \ge P\{T(D_i) \le \alpha,\ \theta_i \ge \theta_\alpha\}$ for any other ranking variable $T(D_i)$ with $P[T(D_i) \le \alpha] \le \alpha$.
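For a concrete feel of the exceedance probability $V_\alpha(x, \sigma^2)$, the sketch below evaluates it under an additional assumption that the poster's measurement model does not make: normal effects $\theta_i \sim N(0, \tau^2)$ and normal errors $X_i \mid \theta_i \sim N(\theta_i, \sigma_i^2)$. The function name `exceedance_prob` and the default `tau2 = 1.0` are hypothetical choices for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def exceedance_prob(x, sigma2, alpha, tau2=1.0):
    """V_alpha(x, sigma^2) = P(theta_i >= theta_alpha | X_i = x, sigma_i^2 = sigma2).

    Assumes theta_i ~ N(0, tau2) and X_i | theta_i ~ N(theta_i, sigma2);
    the normal forms are an illustrative assumption.
    """
    # Upper-alpha quantile of the N(0, tau2) prior.
    theta_alpha = (tau2 ** 0.5) * STD_NORMAL.inv_cdf(1.0 - alpha)
    shrink = tau2 / (tau2 + sigma2)      # weight placed on the observation
    post_mean = shrink * x               # posterior mean of theta_i
    post_sd = (shrink * sigma2) ** 0.5   # posterior standard deviation
    return 1.0 - STD_NORMAL.cdf((theta_alpha - post_mean) / post_sd)

# Same observed value, different measurement precision.
print(exceedance_prob(x=2.0, sigma2=0.1, alpha=0.1))
print(exceedance_prob(x=2.0, sigma2=4.0, alpha=0.1))
```

In this normal case $V_\alpha$ is continuous and increasing in $x$, so the monotonicity assumption holds, and two units with the same $x$ but different $\sigma^2$ receive different exceedance probabilities, which is the differential-uncertainty effect the ranking has to handle.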

[Figure 1: x-axis: standard error (0.01 to 0.10); y-axis: log odds (0.05 to 0.30).]

Fig. 1. Type-2 diabetes example: From the full complement of 127,903 SNPs used in the second stage meta-analysis from Morris et al. 2012, we filtered to a reduced set of 25,558 SNPs that are probably associated with T2D, and plot 3371 of those having highest observed association (log odds exceeding 0.05). These estimates are based on genotype data from 22,669 T2D cases and 58,119 control subjects.

18 Henderson and Newton

[Figure 2: x-axis: set size N (10 to 1000); y-axis: proportion of set detected by RNAi (0.05 to 0.50).]

Fig. 2. RNAi example: From a recent version of Gene Ontology, 5719 terms (gene sets) annotate between 10 and 1000 human genes. Shown is a summary of the integration of these terms with the list of 984 genes detected by RNAi as being involved in influenza virus replication (from Hao et al. 2013). The x-axis shows set size and the y-axis shows the proportion of the set that was detected by RNAi. The plot is restricted to 3626 sets for which the observed proportion exceeds 5%.

Hao, L., Q. He, Z. Wang, M. Craven, M. A. Newton, and P. Ahlquist (2013). Limited agreement of independent RNAi screens for virus-required host genes owes more to false-negative than false-positive factors. PLoS Computational Biology 9 (9), e1003235.

Jost, J. and X. Li-Jost (1998). Calculus of variations, Volume 64. Cambridge University Press.

Kass, R. E. and A. E. Raftery (1995). Bayes factors. Journal of the American Statistical Association 90 (430), 773–795.

Kendziorski, C., M. Newton, H. Lan, and M. Gould (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22 (24), 3899–3914.

Laird, N. M. and T. A. Louis (1989). Empirical Bayes ranking methods. Journal of Educational and Behavioral Statistics 14 (1), 29–46.

Lehmann, E. (1986). Testing statistical hypotheses (2nd ed.). Wiley series in probability and mathematical statistics: Probability and mathematical statistics. Wiley.

Leng, N., J. A. Dawson, J. A. Thomson, V. Ruotti, A. I. Rissman, B. M. Smits, J. D. Haag, M. N. Gould, R. M. Stewart, and C. Kendziorski (2013). EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics 29 (8), 1035–1043.

Lin, R., T. A. Louis, S. M. Paddock, and G. Ridgeway (2006). Loss function based ranking in two-stage, hierarchical models. Bayesian Analysis 1 (4), 915–946.

McCarthy, D. J. and G. K. Smyth (2009). Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics 25 (6), 765–771.

Morris, A. P., B. F. Voight, T. M. Teslovich, T. Ferreira, A. V. Segre, V. Steinthorsdottir, R. J. Strawbridge, H. Khan, H. Grallert, A. Mahajan, et al. (2012). Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nature Genetics 44 (9), 981–990.

Niemi, J. (2010). Evaluating individual player contributions in basketball. In JSM Proceedings, Statistical Computing Section, Alexandria, VA, pp. 4914–4923. American Statistical Association.

Noma, H., S. Matsui, T. Omori, and T. Sato (2010). Bayesian ranking and selection methods using hierarchical mixture models in microarray studies. Biostatistics 11 (2), 281–289.

Normand, S.-L. T., M. E. Glickman, and C. A. Gatsonis (1997). Statistical methods for profiling providers of medical care: issues and applications. Journal of the American Statistical Association 92 (439), 803–814.

Paddock, S. M. and T. A. Louis (2011). Percentile-based empirical distribution function estimates for performance evaluation of healthcare providers. Journal of the Royal Statistical Society: Series C (Applied Statistics) 60 (4), 575–589.


[Figure 5: three panels comparing r-value ranking with the MLE, p-value, and posterior mean rankings; axes labeled N[ok] and MLE; color scale for (X − R)/(X + R) with ticks 1, 1/4, 1/16, 0, −1/16, −1/4, −1.]

Fig. 5. Ranking via various methods compared to r-value ranking; RNAi example; data and axes as in Fig. 2. Plotted is $(X - R)/(X + R)$ where $X$ is the rank (from the top) of the set by the method being compared, and $R$ is the rank by r-value.
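The plotted quantity is simple to compute. The sketch below defines it as a hypothetical helper `rank_contrast` and evaluates it on two rows of Table 2 (NBA example) purely as an illustration, using the MLE rank as X and the r-value rank as R.

```python
def rank_contrast(x_rank, r_rank):
    """(X - R) / (X + R) for ranks counted from the top (1 = best)."""
    return (x_rank - r_rank) / (x_rank + r_rank)

# (MLE.R, RV.R) pairs from Table 2.
examples = {"Mike Muscala": (7, 8), "Kevin Durant": (45, 22)}
for name, (x_rank, r_rank) in examples.items():
    print(f"{name:13s} {rank_contrast(x_rank, r_rank):+.3f}")
```

The value lies strictly between −1 and 1 for positive ranks; it is positive when the r-value places the unit higher than the compared method does, and zero when the two ranks coincide.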

[Figure 3 panels: a. MLE; b. p-value; c. posterior mean; d. maximal agreement]

Fig. 3. Threshold functions, T2D example, data and axes as in Fig. 1: Calculations use an inverse-gamma model for $\sigma^2$. Forty-two threshold functions are shown, ranging in $\alpha$ values from a small positive value (red) just including the first data point up to $\alpha = 0.10$ (blue). (Most data points are truncated by the plot, as in Fig. 1; also, the grid is uniform on the scale of $\log_2[-\log_2(\alpha)]$.) Units associated with a smaller $\alpha$ (i.e., more red) are ranked more highly by the given ranking method. Two units landing on the same curve would be ranked in the same position.

r-values have a distinct signal/noise trade-off

r-values reduce the differential uncertainty artifact

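[Figure: Measurement model simulation with $\sigma_i$ and $\theta_i$ independent. Panels show the density of $\sigma_i$ (x-axis $\sigma$, 0 to 3) among units selected by each method: $p\{\sigma_i \mid E(\theta_i \mid X_i, \sigma_i) \ge e_{0.1}\}$, $p(\sigma_i \mid \text{p-value}_i \le p_{0.1})$, $p(\sigma_i \mid X_i \ge x_{0.1})$, and $p(\sigma_i \mid r(D_i) \le 0.1)$.]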
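The differential-uncertainty artifact can be reproduced in a small simulation. This is a sketch under assumptions that are not stated on the poster: effects are drawn as $\theta_i \sim N(0, 1)$, noise levels as $\sigma_i \sim \mathrm{Uniform}(0.1, 3)$ independently of $\theta_i$, and errors are normal. It compares only two of the selection rules, ranking by the raw measurement $X_i$ and ranking by the posterior mean.

```python
import random

random.seed(0)
N = 20000
units = []
for _ in range(N):
    theta = random.gauss(0.0, 1.0)        # assumed N(0, 1) effects
    sigma = random.uniform(0.1, 3.0)      # assumed noise-level distribution
    x = theta + sigma * random.gauss(0.0, 1.0)
    post_mean = x / (1.0 + sigma ** 2)    # E(theta | x, sigma) under the normal model
    units.append((x, post_mean, sigma))

def mean_sigma_of_top(key_index, frac=0.10):
    """Average sigma among the top `frac` of units ranked by the chosen column."""
    top = sorted(units, key=lambda u: u[key_index], reverse=True)[: int(frac * N)]
    return sum(u[2] for u in top) / len(top)

overall = sum(u[2] for u in units) / N
by_x = mean_sigma_of_top(0)     # select on the raw measurement X_i
by_pm = mean_sigma_of_top(1)    # select on the posterior mean
print(f"mean sigma: overall={overall:.2f}, "
      f"top 10% by X={by_x:.2f}, top 10% by posterior mean={by_pm:.2f}")
```

Selecting on $X_i$ over-represents noisy units, while selecting on the posterior mean over-represents precise ones, even though $\sigma_i$ carries no information about $\theta_i$ in this setup.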

r-values are Bayes rules under a continuum of losses and constraints

according to mid-season data and rank($\theta_i$) is his unknown true rank.

R-values may be computed in all sorts of hierarchical modeling efforts, including semi-parametric models and cases where Markov chain Monte Carlo (MCMC) is used to approximate the marginal posterior distribution of each $\theta_i$ given available data. Figure 6 compares the r-value ranking with other rankings in an example from gene-expression analysis, where evidence suggested that the expression of a large fraction of the human genome was associated with the status of a certain viral infection (Pyeon, et al., 2007). A multi-level model involving both null and non-null genes as well as t-distributed non-null effects $\theta_i$ exhibited good fit to the data, but did not admit a closed form for $V_\alpha(D_i)$. R-values, computed using MCMC output, again reveal systematic ranking differences from other approaches.

Multi-level models drive statistical inference and software in a variety of genomic domains: for example, limma (Smyth, 2004), EBarrays (Kendziorski et al. 2003), EBSeq (Leng et al. 2013), among others. Since these models happen to specify distributional forms for parameters of interest, the associated code could be augmented to compute posterior tail probabilities $V_\alpha(D_i)$ and thus r-values for ranking. The limma system utilizes a conjugate normal, inverse-gamma model, and so $V_\alpha(D_i)$ involves the tail probability of a non-central t distribution. The EBSeq system entails a conjugate beta, negative-binomial model, and so $V_\alpha(D_i)$ for differential expression involves tail probabilities in a certain ratio distribution (Coelho and Mexia, 2007). One expects the benefits of r-value computation to show especially in cases involving many non-null units and relatively high variation among units in their variance parameters (e.g., sequence read depth).

3. Connections

3.1. Connection to Bayes rule

The proposed r-values are not Bayes rules in the usual sense; however, there is a connection to Bayesian inference if one allows both a continuum of loss functions and a distributional constraint on the reported unit-specific (relative) ranks. To see this connection, we introduce a collection of loss functions

$L_\alpha(a, \theta_i) = 1 - \mathbf{1}(a \le \alpha,\ \theta_i \ge \theta_\alpha)$

where action $a$ is a relative rank value in $(0, 1)$, $\alpha \in (0, 1)$ indexes the collection, and again $\theta_\alpha = F^{-1}(1 - \alpha)$ is a quantile in the population of interest. Specifically, no $\alpha$-loss occurs if the inferred relative rank $a$ and the actual relative rank $1 - F(\theta_i)$ both are less than $\alpha$. The marginal (pre-posterior) Bayes risk of rule $\delta(D_i)$ is

$\mathrm{risk}_\alpha = 1 - P\{\delta(D_i) \le \alpha,\ \theta_i \ge \theta_\alpha\}$,  (9)

which is one minus the agreement (2). In the absence of other considerations, the Bayes rule for loss $L_\alpha$ degenerates to $\delta(D_i) = 0$. Degeneration is avoided if we enforce on the reported rank the additional structure that it share with the true relative rank $1 - F(\theta_i)$ the property of being uniformly distributed over the population of units. Such a constrained Bayes rule then minimizes the modified objective function:

$\mathrm{risk}_\alpha + \lambda_\alpha P\{\delta(D_i) \le \alpha\}$

where $\lambda_\alpha$ is chosen to enforce the (marginal) size constraint $P\{\delta(D_i) \le \alpha\} = \alpha$.
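A short worked step, offered as a sketch rather than as the authors' own derivation, shows why this constrained objective leads to a threshold rule on the exceedance probability. It writes $V_\alpha(D_i) = P(\theta_i \ge \theta_\alpha \mid D_i)$ and assumes the multiplier in the modified objective is the same $\lambda_\alpha$ that appears as the threshold in the r-value definition.

```latex
\begin{align*}
\mathrm{risk}_\alpha + \lambda_\alpha P\{\delta(D_i) \le \alpha\}
  &= 1 - E\big[\mathbf{1}\{\delta(D_i) \le \alpha\}\, V_\alpha(D_i)\big]
     + \lambda_\alpha\, E\big[\mathbf{1}\{\delta(D_i) \le \alpha\}\big] \\
  &= 1 - E\big[\mathbf{1}\{\delta(D_i) \le \alpha\}\,
         \{V_\alpha(D_i) - \lambda_\alpha\}\big].
\end{align*}
```

The expectation is maximized pointwise by setting $\delta(D_i) \le \alpha$ exactly when $V_\alpha(D_i) \ge \lambda_\alpha$, which is the event that defines $r(D_i) \le \alpha$.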


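loss: $L_\alpha(a, \theta_i) = 1 - \mathbf{1}(a \le \alpha,\ \theta_i \ge \theta_\alpha)$
constraint: $P\{\delta(D_i) \le \alpha\} = \alpha$

[Figure panel label: $p(\sigma_i)$]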

Note: ranking by posterior expected rank is essentially the same as ranking by posterior mean

Thursday, July 10, 14