Final Project: meet with me over the next week or so to discuss possibilities.
What is Likelihood?
prob(data | parm): the probability of some data given the known parameters of some model
L(data | parm): the likelihood of known data given particular candidate parameters of the model

know the parameters → predict some outcome in future data
observe some data → estimate the parameters that maximize the likelihood of the observed data
Imagine an unfair coin that gives heads with probability p = .6 and tails with probability q = 1 − p = .4. What is the probability of getting x heads in N flips?
Prob(x \mid p) = \binom{N}{x} p^x (1-p)^{N-x} = \frac{N!}{x!(N-x)!}\, p^x (1-p)^{N-x}
The Binomial Distribution f(x; p) gives the probability of observing x “successes” in N trials of a Bernoulli process with success probability p: p^x (1-p)^{N-x} is the probability of getting any particular combination of x heads and N − x tails, and \binom{N}{x} is the number of ways of getting x heads and N − x tails.
Imagine we flip a coin 10 times and get 4 heads (N = 10, x = 4). What is the maximum likelihood estimate of p?
L(x \mid p) = \mathrm{Prob}(x \mid p) = \binom{N}{x} p^x (1-p)^{N-x} = \frac{N!}{x!(N-x)!}\, p^x (1-p)^{N-x}
Now x (and N) are fixed; we want to find the value of p that maximizes the likelihood L of the data.
Matlab week5.m
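week5.m presumably walks through this numerically in MATLAB; here is a comparable minimal sketch in Python (not the course code — the grid resolution and brute-force search are my own choices):

```python
from math import comb

# Brute-force the MLE for the coin example: N = 10 flips, x = 4 heads.
def binom_likelihood(p, x=4, N=10):
    """L(x | p) = C(N, x) * p^x * (1 - p)^(N - x)."""
    return comb(N, x) * p ** x * (1 - p) ** (N - x)

# Evaluate L over a grid of candidate p values and pick the maximizer.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=binom_likelihood)
print(p_hat)  # the likelihood peaks at p = x/N = 0.4
```

The grid search finds the same answer the calculus derivation below gives in closed form.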
Imagine we flip a coin 10 times and get 4 heads (N = 10, x = 4). What is the maximum likelihood estimate of p? USING CALCULUS

L(x \mid p) = \mathrm{Prob}(x \mid p) = \binom{N}{x} p^x (1-p)^{N-x}

\ln L = \ln\binom{N}{x} + x \ln p + (N - x)\ln(1 - p)

\frac{d \ln L}{dp} = x\left(\frac{1}{p}\right) + (N - x)\left(\frac{1}{1-p}\right)(-1) = 0

x(1-p) - (N-x)p = 0
x - xp - Np + xp = 0

p = \frac{x}{N}
Now let’s try finding maximum likelihood parameters for a distribution you’ve never seen before. It’s called the Lamron Distribution:

\mathrm{Prob}(x \mid \alpha, \beta^2) = \frac{1}{\beta\sqrt{2\pi}} \exp\left(-\frac{(x-\alpha)^2}{2\beta^2}\right)

Typically, we would want to know the probability of observing x (actually a range of x) given α and β.
let us assume instead that we have some observed data x1,x2,x3…xn and we want to find the maximum likelihood estimates for α and β
\mathrm{Prob}(x_1, x_2, \ldots \mid \alpha, \beta^2) = \prod_i \frac{1}{\beta\sqrt{2\pi}} \exp\left(-\frac{(x_i-\alpha)^2}{2\beta^2}\right)
In other words, what values of the parameters (α and β) make that observed set of data most likely?
Matlab week5.m
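Again, week5.m demonstrates this in MATLAB; a hypothetical Python sketch of the same idea — grid-searching candidate (α, β) pairs to maximize the Lamron log-likelihood of some simulated data (the data, true parameter values, and grid are invented for illustration):

```python
import math
import random

# Simulated "observed" data; the true alpha = 3.0 and beta = 1.5 are
# arbitrary choices for this illustration.
random.seed(1)
data = [random.gauss(3.0, 1.5) for _ in range(500)]

def log_likelihood(alpha, beta, xs):
    """Sum of log Lamron densities (log of the product over observations)."""
    return sum(-math.log(beta * math.sqrt(2 * math.pi))
               - (x - alpha) ** 2 / (2 * beta ** 2) for x in xs)

# Crude grid search over candidate (alpha, beta) pairs.
best = max(((a / 10, b / 10) for a in range(10, 60) for b in range(5, 30)),
           key=lambda ab: log_likelihood(ab[0], ab[1], data))
print(best)  # should land near (3.0, 1.5)
```

Working with the log-likelihood avoids numerical underflow from multiplying 500 small densities together.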
Of course, this is just the Normal Distribution
\mathrm{Prob}(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
That’s the likelihood of observing one data point x. What about the likelihood of observing x1, x2, x3, …
L(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
Recall that if p(x) is the probability of observing x, and if x1, x2, x3 are independent and identically distributed, then p(x1, x2, x3) = p(x1)p(x2)p(x3)
L(x_1 x_2 \ldots x_n \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)
What is lnL?
\ln L = \sum_{i=1}^{n} \left[-\frac{1}{2}\ln(2\pi) - \ln\sigma - \frac{(x_i-\mu)^2}{2\sigma^2}\right]
What is the derivative wrt μ?
\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \quad\Rightarrow\quad \hat{\mu} = \frac{\sum_{i=1}^{n} x_i}{n}
What is the derivative wrt σ?
\frac{\partial \ln L}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \mu)^2 = 0 \quad\Rightarrow\quad \hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \hat{\mu})^2}{n}}
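These closed-form estimates can be sanity-checked numerically. A minimal Python sketch, assuming simulated data (the true μ = 10, σ = 2, and sample size are arbitrary choices):

```python
import math
import random
import statistics

# Simulate data from a normal distribution with known parameters.
random.seed(7)
data = [random.gauss(10.0, 2.0) for _ in range(1000)]
n = len(data)

# Closed-form maximum likelihood estimates derived above.
mu_hat = sum(data) / n                                         # sample mean
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

# mu_hat is the sample mean; sigma_hat divides by n (not n - 1), so it
# matches the *population* standard deviation, statistics.pstdev.
assert math.isclose(mu_hat, statistics.fmean(data))
assert math.isclose(sigma_hat, statistics.pstdev(data))
print(mu_hat, sigma_hat)
```

Note that the ML estimate of σ divides by n, which is why it is slightly smaller than the familiar unbiased (n − 1) sample standard deviation.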
Consider the Exponential Distribution
A Poisson process is a “memoryless” process that generates random events over time. It’s “memoryless” in the sense that whether an event occurs in the next time instant does not depend on how long ago the last event occurred. Examples: spikes in neurons, radioactive decay.

The exponential distribution gives the distribution of times between events. The Poisson distribution gives the distribution of the number of events within a given period of time. They’re complementary.
f(t \mid \lambda) = \lambda \exp(-\lambda t)
f(t; λ) gives the probability density of the time t between events given the rate λ. We want to figure out the rate λ given some observed data (t1, t2, …, tn).
L(t \mid \lambda) = \lambda \exp(-\lambda t)

L(t_1, t_2, \ldots, t_n \mid \lambda) = \prod_{i=1}^{n} \lambda \exp(-\lambda t_i) = \lambda^n \exp\left(-\lambda \sum_{i=1}^{n} t_i\right)

\ln L = n \ln\lambda - \lambda \sum_{i=1}^{n} t_i

\frac{\partial \ln L}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0 \quad\Rightarrow\quad \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i}
i.e., the average rate
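A quick Python sketch of this estimator on simulated inter-event times (the true rate of 1.5 events per unit time is an arbitrary choice for illustration):

```python
import random

# Simulate inter-event times from a Poisson process with a known rate,
# then recover the rate with the ML estimate derived above.
random.seed(0)
true_rate = 1.5
times = [random.expovariate(true_rate) for _ in range(10_000)]

lam_hat = len(times) / sum(times)   # lambda_hat = n / sum(t_i) = 1 / mean(t)
print(lam_hat)
```

The estimate is the number of events divided by the total elapsed time — exactly the average rate.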
Think about a categorization (or identification) experiment
stimulus Si is categorized with response A or B
What is the data?
        A    B
S1     27   13
S2     22   18
 ⋮
Sk      8   32
What’s the likelihood of the parameters of the GCM (Generalized Context Model) given this data?
L(f_{A|S_1}, f_{B|S_1}, f_{A|S_2}, f_{B|S_2}, \ldots, f_{A|S_k}, f_{B|S_k} \mid \text{parameters}) = \prod_{i=1}^{k} L(f_{A|S_i}, f_{B|S_i} \mid \text{parameters})

= \prod_{i=1}^{k} \binom{N_{S_i}}{f_{A|S_i}} p_{A|S_i}^{\,f_{A|S_i}} (1 - p_{A|S_i})^{N_{S_i} - f_{A|S_i}} = \prod_{i=1}^{k} \binom{f_{A|S_i} + f_{B|S_i}}{f_{A|S_i}\; f_{B|S_i}} p_{A|S_i}^{\,f_{A|S_i}} p_{B|S_i}^{\,f_{B|S_i}}
Think about a categorization (or identification) experiment:
stimulus Si is categorized with response A, B, or C; or stimulus Si is identified with the response associated with stimulus 1, 2, …, n
What is the data?
        A    B    C
Si     34   13    3

When presented with Si, there are 3 possible discrete outcomes – calling it A, B, or C.
Imagine that we knew perfectly the mechanism that was driving people’s categorization responses. This mechanism specifies p(A|Si), p(B|Si), and p(C|Si). If we had N = 50 presentations of stimulus Si, we could figure out the probability of observing xA = 34 A responses, xB = 13 B responses, and xC = 3 C responses.
\mathrm{Prob}(x_A, x_B, x_C \mid p(A|S_i), p(B|S_i), p(C|S_i)) = \binom{N}{x_A\; x_B\; x_C} p(A|S_i)^{x_A}\, p(B|S_i)^{x_B}\, p(C|S_i)^{x_C}
But we don’t know the model – that’s what we want to discover. But we do know the data.
L(x_A, x_B, x_C \mid p(A|S_i), p(B|S_i), p(C|S_i)) = \binom{N}{x_A\; x_B\; x_C} p(A|S_i)^{x_A}\, p(B|S_i)^{x_B}\, p(C|S_i)^{x_C}
where do these probabilities come from? they’re from the model we’re trying to test
For the GCM:

P(A \mid S_i) = \frac{b_A \sum_{j \in A} s_{ij}}{b_A \sum_{j \in A} s_{ij} + b_B \sum_{j \in B} s_{ij}}

s_{ij} = \exp\left(-c \sum_m w_m \left|x_{im} - x_{jm}\right|\right)
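A sketch of how these two GCM equations turn into response probabilities, in Python; the exemplar coordinates, weights, biases, and sensitivity below are invented for illustration, not taken from any real experiment:

```python
import math

# Hypothetical 2-D exemplars for categories A and B.
exemplars = {"A": [(1.0, 1.2), (0.8, 1.5)], "B": [(3.0, 2.8), (3.2, 2.5)]}
b = {"A": 1.0, "B": 1.0}          # response biases
c, w = 2.0, (0.5, 0.5)            # sensitivity and dimension weights

def similarity(stim, ex):
    """s_ij = exp(-c * sum_m w_m |x_im - x_jm|): city-block distance."""
    d = sum(wm * abs(si - ei) for wm, si, ei in zip(w, stim, ex))
    return math.exp(-c * d)

def p_response(stim, resp):
    """P(resp|stim): biased summed similarity, normalized over categories."""
    num = {k: b[k] * sum(similarity(stim, e) for e in exemplars[k])
           for k in exemplars}
    return num[resp] / sum(num.values())

stim = (1.1, 1.3)                  # a stimulus near the category A exemplars
pA = p_response(stim, "A")
print(pA)                          # high, since the stimulus resembles A
```

These model-derived probabilities are what get plugged into the multinomial likelihood above.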
L(x_A, x_B, x_C \mid b_A, b_B, b_C, c, w_1, w_2) = \binom{N}{x_A\; x_B\; x_C} p(A|S_i)^{x_A}\, p(B|S_i)^{x_B}\, p(C|S_i)^{x_C}
        A    B    C
S1     34   13    3
S2     34   13    3
 ⋮
Sn     34   13    3
L(\text{data} \mid b_A, b_B, b_C, c, w_1, w_2) = \prod_{i=1}^{n} \binom{N_i}{x_{iA}\; x_{iB}\; x_{iC}} p(A|S_i)^{x_{iA}}\, p(B|S_i)^{x_{iB}}\, p(C|S_i)^{x_{iC}}
\ln L = \sum_i \ln N_i! - \sum_i \sum_j \ln x_{ij}! + \sum_i \sum_j x_{ij} \ln P(R_j \mid S_i)
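That log-likelihood is computed more stably with the log-gamma function than with raw factorials. A Python sketch for a single stimulus, using the counts from the slide (the model probabilities here are hypothetical placeholders):

```python
import math

def multinomial_lnL(counts, probs):
    """ln L = ln N! - sum_j ln x_j! + sum_j x_j ln p_j, using
    lgamma(k + 1) = ln k! to stay stable for large counts."""
    N = sum(counts)
    total = math.lgamma(N + 1)
    for x, p in zip(counts, probs):
        total += -math.lgamma(x + 1) + x * math.log(p)
    return total

# Counts from the slide (xA = 34, xB = 13, xC = 3); the probabilities
# would come from the model being tested.
lnL = multinomial_lnL([34, 13, 3], [0.68, 0.26, 0.06])
print(lnL)
```

Summing this quantity over stimuli gives the full ln L above.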
Wichmann & Hill fitting psychometric functions (supplemental reading)
Are the dots moving right or left? [plot: accuracy vs. motion coherence]
Psychophysical functions: fitting them using maximum likelihood methods
Imagine a Psychophysical Experiment [plot: accuracy vs. motion coherence]
Why would you fit a Psychophysical Experiment? To find what level of motion coherence gives 75% accuracy. [plot: accuracy vs. motion coherence, 75% threshold marked]
Why would you fit a Psychophysical Experiment? To find the slope of the psychometric function at 75% (or 50% for some applications). [plots: accuracy vs. motion coherence]

Why would you fit a Psychophysical Experiment? To compare conditions, e.g., luminance #1 vs. luminance #2. [plots: accuracy vs. motion coherence for each condition]
How would you do it using maximum likelihoods? [plots: accuracy vs. motion coherence]
L(\text{data} \mid \text{params}) = \prod_{i=1}^{m} \binom{N_i}{n_{i,\mathrm{COR}}\; n_{i,\mathrm{INC}}} p(\mathrm{COR}|S_i)^{n_{i,\mathrm{COR}}}\, p(\mathrm{INC}|S_i)^{n_{i,\mathrm{INC}}}
what function defines these?
L(\text{data} \mid \alpha, \beta, \gamma, \lambda) = \prod_{i=1}^{m} \binom{N_i}{n_{i,\mathrm{COR}}\; n_{i,\mathrm{INC}}} p(\mathrm{COR}|S_i)^{n_{i,\mathrm{COR}}}\, p(\mathrm{INC}|S_i)^{n_{i,\mathrm{INC}}}

L(\text{data} \mid \alpha, \beta, \gamma, \lambda) = \prod_{i=1}^{m} \binom{N_i}{n_{i,\mathrm{COR}}} p(\mathrm{COR}|S_i)^{n_{i,\mathrm{COR}}}\, (1 - p(\mathrm{COR}|S_i))^{N_i - n_{i,\mathrm{COR}}}

p(\mathrm{COR} \mid S_i) = \Psi(x; \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(x; \alpha, \beta)
Weibull:   F(x; \alpha, \beta) = 1 - \exp\left(-(x/\beta)^{\alpha}\right)

Logistic:   F(x; \alpha, \beta) = \frac{1}{1 + \exp(-(x-\alpha)/\beta)}

Normal:   F(x; \alpha, \beta) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x-\alpha}{\beta\sqrt{2}}\right)\right]

Gumbel:   F(x; \alpha, \beta) = 1 - \exp\left(-\exp\left((x-\alpha)/\beta\right)\right)
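Putting the pieces together: a minimal Python sketch that fits α and β of a logistic-based Ψ by maximizing the binomial log-likelihood over a grid. The coherence levels, trial counts, and true parameters are invented for illustration; a real fit would use a proper optimizer, as Wichmann & Hill describe:

```python
import math

def F_logistic(x, alpha, beta):
    """One possible choice of F: the logistic function."""
    return 1.0 / (1.0 + math.exp(-(x - alpha) / beta))

def psi(x, alpha, beta, gamma=0.5, lam=0.02):
    """psi = gamma + (1 - gamma - lambda) * F; gamma = 0.5 suits a 2AFC task."""
    return gamma + (1.0 - gamma - lam) * F_logistic(x, alpha, beta)

# Hypothetical experiment: coherence levels, 100 trials per level, and
# correct counts generated from a known (alpha = 0.3, beta = 0.1) function.
levels = [0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
N = 100
n_cor = [round(N * psi(x, 0.3, 0.1)) for x in levels]

def lnL(alpha, beta):
    """Binomial log-likelihood of the correct counts (constant terms dropped)."""
    total = 0.0
    for x, n in zip(levels, n_cor):
        p = psi(x, alpha, beta)
        total += n * math.log(p) + (N - n) * math.log(1.0 - p)
    return total

# Brute-force grid search over (alpha, beta); a real fit would optimize.
a_hat, b_hat = max(((a / 100, b / 100)
                    for a in range(5, 80) for b in range(2, 40)),
                   key=lambda ab: lnL(*ab))
print(a_hat, b_hat)  # should recover values near (0.3, 0.1)
```

The binomial coefficients are constant in (α, β), so they can be dropped from the log-likelihood without changing where the maximum lies.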
see week5_psychometric_function.m