Profile Analysis

Definition

• Let $X_1, X_2, \dots, X_p$ denote $p$ jointly distributed variables under study

• Let $\mu_1, \mu_2, \dots, \mu_p$ denote the means of these variables

• The profile of these variables is a plot of $\mu_i$ vs $i$.

[Figure: profile plot with $i$ on the horizontal axis and $\mu_i$ on the vertical axis]

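The multivariate test

Let $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n$ denote a sample of $n$ from the $p$-variate normal distribution with mean vector $\boldsymbol{\mu}_x$ and covariance matrix $\Sigma$.

Let $\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_m$ denote a sample of $m$ from the $p$-variate normal distribution with mean vector $\boldsymbol{\mu}_y$ and covariance matrix $\Sigma$.

Suppose we want to test

$$H_0: \boldsymbol{\mu}_x = \boldsymbol{\mu}_y \quad \text{vs} \quad H_A: \boldsymbol{\mu}_x \neq \boldsymbol{\mu}_y$$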

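Hotelling's $T^2$ statistic for the two-sample problem

$$T^2 = (\bar{\mathbf{x}} - \bar{\mathbf{y}})' \left[ \left( \frac{1}{n} + \frac{1}{m} \right) S_{pooled} \right]^{-1} (\bar{\mathbf{x}} - \bar{\mathbf{y}})$$

where

$$S_{pooled} = \frac{n-1}{n+m-2} S_x + \frac{m-1}{n+m-2} S_y$$

If $H_0$ is true, then

$$F = \frac{n+m-p-1}{p(n+m-2)} T^2$$

has an F distribution with $\nu_1 = p$ and $\nu_2 = n + m - p - 1$.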
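As a minimal sketch of the two-sample statistic defined on this slide, the function below computes $T^2$ and the corresponding $F$ from two data matrices. NumPy, the function name `hotelling_two_sample`, and the simulated data are assumptions made for illustration and are not part of the original slides.

```python
import numpy as np


def hotelling_two_sample(x, y):
    """Two-sample Hotelling T^2 for data matrices x (n x p) and y (m x p).

    Returns (T2, F, nu1, nu2) following the formulas on this slide.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = x.shape
    m = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    s_x = np.atleast_2d(np.cov(x, rowvar=False))  # divisor n - 1
    s_y = np.atleast_2d(np.cov(y, rowvar=False))  # divisor m - 1
    s_pooled = ((n - 1) * s_x + (m - 1) * s_y) / (n + m - 2)
    # T^2 = diff' [(1/n + 1/m) S_pooled]^{-1} diff
    t2 = float(diff @ np.linalg.solve((1 / n + 1 / m) * s_pooled, diff))
    f = (n + m - p - 1) / (p * (n + m - 2)) * t2
    return t2, f, p, n + m - p - 1


# Hypothetical simulated data: 20 and 15 cases on p = 3 variables.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
y = rng.normal(loc=0.5, size=(15, 3))
t2, f, nu1, nu2 = hotelling_two_sample(x, y)
print(f"T2 = {t2:.3f}, F = {f:.3f}, df = ({nu1}, {nu2})")
```

The returned $F$ would then be compared with the upper $\alpha$ point of the F distribution with $(\nu_1, \nu_2)$ degrees of freedom, taken from a table or a statistics library.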

Profile Comparison

[Figure: mean profiles of Group A and Group B plotted against variables 1, 2, 3, …, p]

Hotelling's $T^2$ test tests

$H_0$: Equality of profiles

against

$H_A$: Different profiles

Profile Analysis

Parallelism

Variables not interacting with groups (parallelism)

[Figure: group mean profiles plotted against variables 1, 2, 3, …, p; the profiles of the groups are parallel]

Variables interacting with groups (lack of parallelism)

[Figure: group mean profiles plotted against variables 1, 2, 3, …, p; the profiles of the groups are not parallel]

Parallelism

• Group differences are constant across variables

Lack of Parallelism

• Group differences are variable dependent

• The difference between groups is not the same for each variable

Test for parallelism

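Let $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n$ denote a sample of $n$ from the $p$-variate normal distribution with mean vector $\boldsymbol{\mu}_x$ and covariance matrix $\Sigma$.

Let $\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_m$ denote a sample of $m$ from the $p$-variate normal distribution with mean vector $\boldsymbol{\mu}_y$ and covariance matrix $\Sigma$.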

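Let

$$C_{(p-1) \times p} = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}$$

Then

$$C\mathbf{X} = C \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix} = \begin{pmatrix} X_1 - X_2 \\ X_2 - X_3 \\ \vdots \\ X_{p-1} - X_p \end{pmatrix}$$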
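To make the role of $C$ concrete, here is a small sketch that builds the successive-difference matrix and applies it to a vector and to a data matrix. NumPy, the helper name `successive_difference_matrix`, and the numbers used are assumptions for illustration only.

```python
import numpy as np


def successive_difference_matrix(p):
    """Return the (p - 1) x p matrix C whose i-th row has 1 in column i and -1 in column i + 1."""
    c = np.zeros((p - 1, p))
    idx = np.arange(p - 1)
    c[idx, idx] = 1.0
    c[idx, idx + 1] = -1.0
    return c


C = successive_difference_matrix(4)
x = np.array([5.0, 3.0, 4.0, 1.0])  # hypothetical observation on p = 4 variables
print(C @ x)                        # X1 - X2, X2 - X3, X3 - X4

# For a data matrix with one case per row, the same differences are obtained
# either as data @ C.T or by subtracting adjacent columns directly.
data = np.array([[5.0, 3.0, 4.0, 1.0],
                 [2.0, 2.0, 6.0, 3.0]])
print(np.allclose(data @ C.T, data[:, :-1] - data[:, 1:]))
```

Because every row of $C$ sums to zero, adding the same constant to all $p$ variables leaves $C\mathbf{X}$ unchanged, which is why $C$ removes level differences and keeps only the shape of the profile.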

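Consider the data $C\mathbf{x}_1, C\mathbf{x}_2, \dots, C\mathbf{x}_n$.

This is a sample of $n$ from the $(p-1)$-variate normal distribution with mean vector $C\boldsymbol{\mu}_x$ and covariance matrix $C \Sigma C'$.

Also $C\mathbf{y}_1, C\mathbf{y}_2, \dots, C\mathbf{y}_m$ is a sample of $m$ from the $(p-1)$-variate normal distribution with mean vector $C\boldsymbol{\mu}_y$ and covariance matrix $C \Sigma C'$.

The test for parallelism is

$$H_0: C\boldsymbol{\mu}_x = C\boldsymbol{\mu}_y \quad \text{vs} \quad H_A: C\boldsymbol{\mu}_x \neq C\boldsymbol{\mu}_y$$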

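Hotelling's $T^2$ test for parallelism

$$T^2 = \frac{nm}{n+m} (C\bar{\mathbf{x}} - C\bar{\mathbf{y}})' \left[ C S_{pooled} C' \right]^{-1} (C\bar{\mathbf{x}} - C\bar{\mathbf{y}})$$

If $H_0$ is true, then

$$F = \frac{n+m-p}{(p-1)(n+m-2)} T^2$$

has an F distribution with $\nu_1 = p - 1$ and $\nu_2 = n + m - p$.

Thus we reject $H_0$ if $F > F_\alpha$ with $\nu_1 = p - 1$ and $\nu_2 = n + m - p$.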

To perform the test for parallelism, compute differences of successive variables for each case in each group and perform the two-sample Hotelling's $T^2$ test.
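The procedure just described can be sketched as follows: difference adjacent variables within each case, then apply the two-sample $T^2$ to the differences. NumPy, the function name `parallelism_test`, and the simulated data are assumptions for illustration, not the author's code.

```python
import numpy as np


def parallelism_test(x, y):
    """Test for parallel profiles; x is n x p, y is m x p.

    Returns (T2, F, nu1, nu2) with nu1 = p - 1 and nu2 = n + m - p.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = x.shape
    m = y.shape[0]
    dx = x[:, :-1] - x[:, 1:]  # successive differences, i.e. C x_i for each case
    dy = y[:, :-1] - y[:, 1:]
    d = dx.mean(axis=0) - dy.mean(axis=0)
    s_dx = np.atleast_2d(np.cov(dx, rowvar=False))
    s_dy = np.atleast_2d(np.cov(dy, rowvar=False))
    # Pooled covariance of the differences, equal to C S_pooled C'
    s_pooled = ((n - 1) * s_dx + (m - 1) * s_dy) / (n + m - 2)
    t2 = n * m / (n + m) * float(d @ np.linalg.solve(s_pooled, d))
    f = (n + m - p) / ((p - 1) * (n + m - 2)) * t2
    return t2, f, p - 1, n + m - p


# Hypothetical simulated data: the second group is shifted by a constant,
# so the population profiles are parallel.
rng = np.random.default_rng(1)
x = rng.normal(size=(25, 4))
y = rng.normal(size=(18, 4)) + 2.0
t2, f, nu1, nu2 = parallelism_test(x, y)
print(f"T2 = {t2:.3f}, F = {f:.3f}, df = ({nu1}, {nu2})")
```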

Test for Equality of Groups

(Parallelism assumed)

Groups equal

[Figure: group mean profiles plotted against variables 1, 2, 3, …, p]

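If parallelism is not rejected:

It is appropriate to test for equality of profiles

$$H_0: \frac{1}{p}(\mu_{x1} + \cdots + \mu_{xp}) = \frac{1}{p}(\mu_{y1} + \cdots + \mu_{yp}) \quad \text{vs} \quad H_A: \frac{1}{p}(\mu_{x1} + \cdots + \mu_{xp}) \neq \frac{1}{p}(\mu_{y1} + \cdots + \mu_{yp})$$

i.e.

$$H_0: \frac{1}{p}\mathbf{1}'\boldsymbol{\mu}_x = \frac{1}{p}\mathbf{1}'\boldsymbol{\mu}_y \quad \text{vs} \quad H_A: \frac{1}{p}\mathbf{1}'\boldsymbol{\mu}_x \neq \frac{1}{p}\mathbf{1}'\boldsymbol{\mu}_y$$

where $\mathbf{1}$ is the $p$-vector of ones.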

The t test

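$$t = \sqrt{\frac{nm}{n+m}} \; \frac{\frac{1}{p}\mathbf{1}'\bar{\mathbf{x}} - \frac{1}{p}\mathbf{1}'\bar{\mathbf{y}}}{\sqrt{\frac{1}{p^2}\mathbf{1}' S_{pooled} \mathbf{1}}} = \sqrt{\frac{nm}{n+m}} \; \frac{\mathbf{1}'\bar{\mathbf{x}} - \mathbf{1}'\bar{\mathbf{y}}}{\sqrt{\mathbf{1}' S_{pooled} \mathbf{1}}}$$

Thus we reject $H_0$ if $|t| > t_{\alpha/2}$ with df $= \nu = n + m - 2$.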

To perform this test, average all the variables for each case in each group and perform the two-sample t-test.
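A minimal sketch of this averaging procedure is shown below. Because the factor $1/p$ cancels between numerator and denominator, the $t$ computed from case averages equals the one based on $\mathbf{1}'\bar{\mathbf{x}} - \mathbf{1}'\bar{\mathbf{y}}$ and $\mathbf{1}' S_{pooled} \mathbf{1}$. NumPy, the function name `equal_levels_t_test`, and the small data set are assumptions for illustration.

```python
import math

import numpy as np


def equal_levels_t_test(x, y):
    """t test for equality of groups, assuming parallel profiles.

    x is n x p, y is m x p. Returns (t, df) with df = n + m - 2.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = x.shape[0], y.shape[0]
    ax = x.mean(axis=1)  # average of the p variables for each case
    ay = y.mean(axis=1)
    # Pooled variance of the case averages
    s2 = ((n - 1) * ax.var(ddof=1) + (m - 1) * ay.var(ddof=1)) / (n + m - 2)
    t = (ax.mean() - ay.mean()) / math.sqrt(s2 * (1 / n + 1 / m))
    return float(t), n + m - 2


# Hypothetical data: three cases per group, p = 2 variables.
x = [[1, 3], [2, 4], [3, 5]]
y = [[2, 6], [4, 8], [6, 10]]
t, df = equal_levels_t_test(x, y)
print(f"t = {t:.3f}, df = {df}")
```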

Test for equality of variables

(Parallelism Assumed)

Variables equal

[Figure: group mean profiles plotted against variable index i = 1, 2, 3, …]

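Let

$$C_{(p-1) \times p} = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}$$

Then

$$C\mathbf{X} = C \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix} = \begin{pmatrix} X_1 - X_2 \\ X_2 - X_3 \\ \vdots \\ X_{p-1} - X_p \end{pmatrix}$$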

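Consider the data $C\mathbf{x}_1, C\mathbf{x}_2, \dots, C\mathbf{x}_n$.

This is a sample of $n$ from the $(p-1)$-variate normal distribution with mean vector $C\boldsymbol{\mu}_x$ and covariance matrix $C \Sigma C'$.

The test for equality of variables for the first group is:

$$H_0: C\boldsymbol{\mu}_x = \mathbf{0} \quad \text{vs} \quad H_A: C\boldsymbol{\mu}_x \neq \mathbf{0}$$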

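Hotelling's $T^2$ test for equality of variables

$$T^2 = n (C\bar{\mathbf{x}} - \mathbf{0})' \left[ C S_x C' \right]^{-1} (C\bar{\mathbf{x}} - \mathbf{0}) = n (C\bar{\mathbf{x}})' \left[ C S_x C' \right]^{-1} (C\bar{\mathbf{x}})$$

If $H_0$ is true, then

$$F = \frac{n-p+1}{(p-1)(n-1)} T^2$$

has an F distribution with $\nu_1 = p - 1$ and $\nu_2 = n - p + 1$.

Thus we reject $H_0$ if $F > F_\alpha$ with $\nu_1 = p - 1$ and $\nu_2 = n - p + 1$.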

To perform the test, compute differences of successive variables for each case in the group and perform the one-sample Hotelling's $T^2$ test for a zero mean vector.

A similar test can be performed for the second sample.

Neither of these tests assumes parallelism.
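The single-group procedure can be sketched as below, using that group's own sample covariance of the differences with $n - 1$ degrees of freedom. NumPy, the function name `equal_variables_one_sample`, and the toy data are assumptions for illustration.

```python
import numpy as np


def equal_variables_one_sample(x):
    """One-sample Hotelling T^2 that the successive differences have zero mean.

    x is n x p. Returns (T2, F, nu1, nu2) with nu1 = p - 1, nu2 = n - p + 1.
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    d = x[:, :-1] - x[:, 1:]  # C x_i for each case
    dbar = d.mean(axis=0)
    s_d = np.atleast_2d(np.cov(d, rowvar=False))  # sample covariance of the differences
    t2 = n * float(dbar @ np.linalg.solve(s_d, dbar))
    f = (n - p + 1) / ((p - 1) * (n - 1)) * t2
    return t2, f, p - 1, n - p + 1


# Hypothetical data: four cases on p = 2 variables.
x = [[3, 1], [5, 2], [4, 2], [6, 1]]
t2, f, nu1, nu2 = equal_variables_one_sample(x)
print(f"T2 = {t2:.3f}, F = {f:.3f}, df = ({nu1}, {nu2})")
```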

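If parallelism is assumed then

$$C\boldsymbol{\mu}_x = C\boldsymbol{\mu}_y$$

Then $C\mathbf{x}_1, C\mathbf{x}_2, \dots, C\mathbf{x}_n, C\mathbf{y}_1, C\mathbf{y}_2, \dots, C\mathbf{y}_m$

is a sample of $n + m$ from the $(p-1)$-variate normal distribution with mean vector $C\boldsymbol{\mu}_x = C\boldsymbol{\mu}_y$ and covariance matrix $C \Sigma C'$.

The test for equality of variables is:

$$H_0: C\boldsymbol{\mu}_x = C\boldsymbol{\mu}_y = \mathbf{0} \quad \text{vs} \quad H_A: C\boldsymbol{\mu}_x = C\boldsymbol{\mu}_y \neq \mathbf{0}$$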

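Hotelling's $T^2$ test for equality of variables

$$T^2 = \frac{1}{n+m} (nC\bar{\mathbf{x}} + mC\bar{\mathbf{y}})' \left[ C S_{pooled} C' \right]^{-1} (nC\bar{\mathbf{x}} + mC\bar{\mathbf{y}})$$

If $H_0$ is true, then

$$F = \frac{n+m-p}{(p-1)(n+m-2)} T^2$$

has an F distribution with $\nu_1 = p - 1$ and $\nu_2 = n + m - p$.

Thus we reject $H_0$ if $F > F_\alpha$ with $\nu_1 = p - 1$ and $\nu_2 = n + m - p$.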

To perform this test for equality of variables (parallelism assumed),

1. Compute differences of successive variables for each case in each group

2. Combine the two samples into a single sample of $n + m$ and

3. Perform the single-sample Hotelling's $T^2$ test for a zero mean vector.
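The steps above can be sketched as follows. This sketch follows the $T^2$ formula given for this test, which uses the pooled within-group covariance $C S_{pooled} C'$ (with $n + m - 2$ degrees of freedom) rather than the covariance of the combined sample. NumPy, the function name `equal_variables_pooled`, and the toy data are assumptions for illustration.

```python
import numpy as np


def equal_variables_pooled(x, y):
    """Test that all variable means are equal, assuming parallel profiles.

    x is n x p, y is m x p. Returns (T2, F, nu1, nu2) with
    nu1 = p - 1 and nu2 = n + m - p.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = x.shape
    m = y.shape[0]
    dx = x[:, :-1] - x[:, 1:]  # step 1: successive differences in each group
    dy = y[:, :-1] - y[:, 1:]
    # Step 2: n * C xbar + m * C ybar is the sum of all n + m difference vectors
    total = n * dx.mean(axis=0) + m * dy.mean(axis=0)
    s_dx = np.atleast_2d(np.cov(dx, rowvar=False))
    s_dy = np.atleast_2d(np.cov(dy, rowvar=False))
    s_pooled = ((n - 1) * s_dx + (m - 1) * s_dy) / (n + m - 2)
    # Step 3: T^2 for a zero mean vector
    t2 = float(total @ np.linalg.solve(s_pooled, total)) / (n + m)
    f = (n + m - p) / ((p - 1) * (n + m - 2)) * t2
    return t2, f, p - 1, n + m - p


# Hypothetical data: four and three cases on p = 2 variables.
x = [[3, 1], [5, 2], [4, 2], [6, 1]]
y = [[2, 1], [4, 1], [3, 2]]
t2, f, nu1, nu2 = equal_variables_pooled(x, y)
print(f"T2 = {t2:.3f}, F = {f:.3f}, df = ({nu1}, {nu2})")
```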

Example

• Two groups of elderly males

• Groups
1. Males identified with no senile factor
2. Males identified with a senile factor

• Variables: scores on the WAIS (intelligence) test
1. Information
2. Similarities
3. Arithmetic
4. Picture completion

Summary Statistics

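                      Group
Subtest               No senile factor    Senile factor
                      (n1 = 37)           (n2 = 12)
Information           12.57               8.75
Arithmetic             9.57               5.33
Similarities          11.49               8.50
Picture Completion     7.97               4.75

$$S = \begin{pmatrix} 11.2624 & 9.406 & 7.155 & 3.3791 \\ & 13.5265 & 7.34784 & 2.5014 \\ & & 11.5796 & 2.6167 \\ & & & 5.83133 \end{pmatrix}$$

(symmetric; only the upper triangle is shown)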

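Hotelling's $T^2$ test (2 sample)

$$T^2 = \frac{nm}{n+m} (\bar{\mathbf{x}} - \bar{\mathbf{y}})' S_{pooled}^{-1} (\bar{\mathbf{x}} - \bar{\mathbf{y}}) = 22.13, \quad F = 5.18, \quad \nu_1 = 4, \; \nu_2 = 44$$

$H_0$: equal means, is rejected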

Profile Analysis

[Figure: profile plot of the mean subtest scores (Information, Arithmetic, Similarities, Picture Completion) for the no senile factor and senile factor groups; vertical axis from 0 to 14]

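Hotelling's $T^2$ test for parallelism

$$T^2 = \frac{nm}{n+m} (C\bar{\mathbf{x}} - C\bar{\mathbf{y}})' \left[ C S_{pooled} C' \right]^{-1} (C\bar{\mathbf{x}} - C\bar{\mathbf{y}}) = 1.464$$

$$F = \frac{n+m-p}{(p-1)(n+m-2)} T^2 = \frac{37+12-4}{(4-1)(37+12-2)} (1.464) = \frac{45}{3(47)} (1.464) = 0.47$$

Decision: Accept $H_0$: parallelism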

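The t test for equality of groups assuming parallelism

$$t = \sqrt{\frac{nm}{n+m}} \; \frac{\mathbf{1}'\bar{\mathbf{x}} - \mathbf{1}'\bar{\mathbf{y}}}{\sqrt{\mathbf{1}' S_{pooled} \mathbf{1}}} = \sqrt{\frac{37(12)}{37+12}} \; \frac{41.6 - 27.33}{\sqrt{107.06}} = 4.15$$

Thus we reject $H_0$ if $|t| > t_{\alpha/2}$ with df $= \nu = n + m - 2 = 47$.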
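As a quick arithmetic check of this $t$ value, the sketch below recomputes it from the summary statistics listed earlier for this example. NumPy and the variable names are assumptions for illustration; the means and the matrix $S$ are copied from the summary statistics slide, so small differences from the slide values are due to rounding.

```python
import math

import numpy as np

n, m = 37, 12
xbar = np.array([12.57, 9.57, 11.49, 7.97])  # no senile factor
ybar = np.array([8.75, 5.33, 8.50, 4.75])    # senile factor

# Upper triangle of S as listed on the summary statistics slide
upper = np.array([
    [11.2624, 9.406, 7.155, 3.3791],
    [0.0, 13.5265, 7.34784, 2.5014],
    [0.0, 0.0, 11.5796, 2.6167],
    [0.0, 0.0, 0.0, 5.83133],
])
S = upper + np.triu(upper, k=1).T  # fill in the lower triangle by symmetry

ones = np.ones(4)
sum_x = float(ones @ xbar)       # 1' xbar
sum_y = float(ones @ ybar)       # 1' ybar
quad = float(ones @ S @ ones)    # 1' S 1
t = math.sqrt(n * m / (n + m)) * (sum_x - sum_y) / math.sqrt(quad)
print(f"1'x = {sum_x:.2f}, 1'y = {sum_y:.2f}, 1'S1 = {quad:.2f}, t = {t:.2f}")
```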

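Hotelling's $T^2$ test for equality of variables

$$T^2 = \frac{1}{n+m} (nC\bar{\mathbf{x}} + mC\bar{\mathbf{y}})' \left[ C S_{pooled} C' \right]^{-1} (nC\bar{\mathbf{x}} + mC\bar{\mathbf{y}}) = 167.15$$

$$F = \frac{n+m-p}{(p-1)(n+m-2)} T^2 = 53.35$$

Thus we reject $H_0$ if $F > F_\alpha$ with $\nu_1 = p - 1 = 3$ and $\nu_2 = n + m - p = 45$.

$F_{0.05} = 6.50$ for $\nu_1 = 3$ and $\nu_2 = 45$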
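The $F$ values quoted in this example all come from the same conversion of a $T^2$ based on $S_{pooled}$. The short sketch below reproduces them from the reported $T^2$ values; the helper name `f_from_t2` and its argument `q` (the dimension of the vectors being compared) are assumptions for illustration.

```python
def f_from_t2(t2, n, m, q):
    """Convert a T^2 based on S_pooled (n + m - 2 df) for q-dimensional vectors to F.

    Returns (F, nu1, nu2) with nu1 = q and nu2 = n + m - q - 1.
    """
    nu2 = n + m - q - 1
    return nu2 / (q * (n + m - 2)) * t2, q, nu2


n, m, p = 37, 12, 4
tests = {
    "equal mean vectors (q = p)": (22.13, p),
    "parallelism (q = p - 1)": (1.464, p - 1),
    "equality of variables (q = p - 1)": (167.15, p - 1),
}
for name, (t2, q) in tests.items():
    f, nu1, nu2 = f_from_t2(t2, n, m, q)
    print(f"{name}: F = {f:.2f}, df = ({nu1}, {nu2})")
```

With $q = p - 1$ the factor becomes $(n+m-p)/((p-1)(n+m-2))$, which is the form used for the parallelism and equality of variables tests.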

Example 2: Profile Analysis for MANOVA

In the following study, n = 15 first-year university students from each of three school regions (A, B and C), each taking the same four courses (Math, Biology, English and Sociology), were observed. Their marks in these courses are tabulated below:

The data

         Educational Region A              | Educational Region B              | Educational Region C
Student  Math  Biology  English  Sociology | Math  Biology  English  Sociology | Math  Biology  English  Sociology
1          62       65       67         76 |   65       55       35         43 |   47       47       98         78
2          54       61       75         70 |   87       81       59         64 |   57       69       68         45
3          53       53       53         59 |   75       67       56         68 |   65       71       77         62
4          48       56       73         81 |   74       70       55         66 |   41       64       68         58
5          60       55       49         60 |   83       71       40         52 |   56       54       86         64
6          55       52       34         41 |   59       48       48         57 |   63       73       88         76
7          76       71       35         40 |   61       47       46         54 |   43       62       84         78
8          58       52       58         46 |   81       77       51         45 |   28       47       65         58
9          75       71       60         59 |   77       68       42         49 |   47       54       90         78
10         55       51       69         75 |   82       84       63         70 |   42       44       79         73
11         72       74       64         59 |   68       64       35         44 |   50       53       89         89
12         72       75       51         47 |   60       53       60         65 |   46       61       91         82
13         76       69       69         57 |   94       88       51         63 |   74       78       99         86
14         44       48       65         65 |   96       88       67         81 |   63       66       94         86
15         89       71       59         67 |   84       75       46         67 |   69       82       78         73

Summary Statistics

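$$\bar{\mathbf{x}}_A = (63.267, \; 61.600, \; 58.733, \; 60.133)'$$

$$S_A = \begin{pmatrix} 160.638 & 104.829 & -32.638 & -47.110 \\ 104.829 & 92.543 & -4.900 & -22.229 \\ -32.638 & -4.900 & 155.638 & 128.967 \\ -47.110 & -22.229 & 128.967 & 159.552 \end{pmatrix}$$

$$\bar{\mathbf{x}}_B = (76.400, \; 69.067, \; 50.267, \; 59.200)'$$

$$S_B = \begin{pmatrix} 141.257 & 155.829 & 45.100 & 60.914 \\ 155.829 & 185.924 & 61.767 & 71.057 \\ 45.100 & 61.767 & 96.495 & 93.371 \\ 60.914 & 71.057 & 93.371 & 123.600 \end{pmatrix}$$

$$\bar{\mathbf{x}}_C = (52.733, \; 61.667, \; 83.600, \; 72.400)'$$

$$S_C = \begin{pmatrix} 156.067 & 116.976 & 53.814 & 35.257 \\ 116.976 & 136.381 & 3.143 & -0.429 \\ 53.814 & 3.143 & 116.543 & 114.886 \\ 35.257 & -0.429 & 114.886 & 156.400 \end{pmatrix}$$

$$\bar{\mathbf{x}} = \frac{15}{45}\bar{\mathbf{x}}_A + \frac{15}{45}\bar{\mathbf{x}}_B + \frac{15}{45}\bar{\mathbf{x}}_C = (64.133, \; 64.111, \; 64.200, \; 63.911)'$$

$$S_{Pooled} = \frac{14}{42} S_A + \frac{14}{42} S_B + \frac{14}{42} S_C = \begin{pmatrix} 152.654 & 125.878 & 22.092 & 16.354 \\ 125.878 & 138.283 & 20.003 & 16.133 \\ 22.092 & 20.003 & 122.892 & 112.408 \\ 16.354 & 16.133 & 112.408 & 146.517 \end{pmatrix}$$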
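The overall mean vector and pooled covariance matrix can be recomputed from the three group summaries with a few lines of code. The sketch below uses the group means and covariance matrices reported in these summary statistics; NumPy and the variable names are assumptions for illustration.

```python
import numpy as np

sizes = {"A": 15, "B": 15, "C": 15}
means = {
    "A": np.array([63.267, 61.600, 58.733, 60.133]),
    "B": np.array([76.400, 69.067, 50.267, 59.200]),
    "C": np.array([52.733, 61.667, 83.600, 72.400]),
}
covs = {
    "A": np.array([[160.638, 104.829, -32.638, -47.110],
                   [104.829, 92.543, -4.900, -22.229],
                   [-32.638, -4.900, 155.638, 128.967],
                   [-47.110, -22.229, 128.967, 159.552]]),
    "B": np.array([[141.257, 155.829, 45.100, 60.914],
                   [155.829, 185.924, 61.767, 71.057],
                   [45.100, 61.767, 96.495, 93.371],
                   [60.914, 71.057, 93.371, 123.600]]),
    "C": np.array([[156.067, 116.976, 53.814, 35.257],
                   [116.976, 136.381, 3.143, -0.429],
                   [53.814, 3.143, 116.543, 114.886],
                   [35.257, -0.429, 114.886, 156.400]]),
}

total = sum(sizes.values())  # 45 students in all
k = len(sizes)               # 3 groups

# Weighted average of the group means (weights 15/45 each)
grand_mean = sum(sizes[g] * means[g] for g in sizes) / total
# Weighted average of the group covariance matrices (weights 14/42 each)
s_pooled = sum((sizes[g] - 1) * covs[g] for g in sizes) / (total - k)

print(np.round(grand_mean, 3))
print(np.round(s_pooled, 3))
```

With equal group sizes both quantities reduce to simple averages over the three groups, which is why every weight equals one third.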