Andrew Ng Linear regression with one variable Model representation Machine Learning


Linear regression with one variable

Model representation

Machine Learning

[Figure: scatter plot "Housing Prices (Portland, OR)". x-axis: Size (feet²); y-axis: Price (in 1000s of dollars).]

Supervised Learning

Given the “right answer” for each example in the data.

Regression Problem

Predict real-valued output

Classification: Discrete-valued output

[Plot annotation: size 1250 feet², price 220 (in 1000s of dollars).]

Notation:

m = number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable

Training set of housing prices (Portland, OR), with m rows:

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

(x, y): one training example
$(x^{(i)}, y^{(i)})$: the $i$-th training example

$x^{(1)} = 2104$, $x^{(2)} = 1416$, $y^{(1)} = 460$

[Diagram: Training Set → Learning Algorithm → h (hypothesis). Size of house (x) → h → Estimated price (estimated value of y).]

h maps from x’s to y’s.

How do we represent h?

$h_\theta(x) = \theta_0 + \theta_1 x$ (shorthand: $h(x)$)

Linear regression with one variable (x), also called univariate linear regression.
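The mapping from house size to estimated price can be sketched in a few lines. This is a minimal illustration, assuming the linear form h(x) = theta0 + theta1 * x used in the later slides; the function name `hypothesis` is hypothetical, and the parameter values 50 and 0.06 are borrowed from a later plot purely as an example, not as fitted values.

```python
def hypothesis(theta0, theta1, x):
    """Univariate linear hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# Illustrative parameters only (not fitted to the training set).
theta0, theta1 = 50.0, 0.06

# Sizes in feet^2 taken from the training set table.
for size in (2104, 1416, 1534, 852):
    print(size, hypothesis(theta0, theta1, size))
```

Different choices of theta0 and theta1 give different lines, which is why the next section asks how to choose them.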

Cost function

Machine Learning


Training Set

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

$\theta_i$’s: Parameters

How to choose $\theta_i$’s?

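[Figure: three plots of h(x) against x, both axes running from 0 to 3, one per choice of parameters.]

h(x) = 1.5 + 0·x    ($\theta_0 = 1.5$, $\theta_1 = 0$)
h(x) = 0.5·x        ($\theta_0 = 0$, $\theta_1 = 0.5$)
h(x) = 1 + 0.5·x    ($\theta_0 = 1$, $\theta_1 = 0.5$)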

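[Figure: training examples plotted as y against x, with a candidate line $h_\theta(x)$.]

Idea: Choose $\theta_0, \theta_1$ so that $h_\theta(x)$ is close to $y$ for our training examples $(x, y)$.

$h_\theta(x) = \theta_0 + \theta_1 x$

$\min_{\theta_0, \theta_1} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Cost function (squared error function):

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Minimize $J(\theta_0, \theta_1)$ over $\theta_0, \theta_1$.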
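The squared error cost can be computed directly from its definition. The sketch below assumes the standard form J = 1/(2m) * sum((h(x_i) - y_i)^2) with h(x) = theta0 + theta1 * x; the function name `compute_cost` is hypothetical, and the parameter values are arbitrary choices for illustration.

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared error cost J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y  # h(x_i) - y_i
        total += error ** 2
    return total / (2 * m)

# The four rows listed in the training set table.
xs = [2104, 1416, 1534, 852]
ys = [460, 232, 315, 178]

# Arbitrary parameter values; a smaller J means a closer fit.
print(compute_cost(50.0, 0.06, xs, ys))
```

A line that passes through every training example gives a cost of exactly 0; any other line gives a positive cost.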

Cost function intuition I

Machine Learning


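Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

Parameters: $\theta_0, \theta_1$

Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Simplified (set $\theta_0 = 0$):

Hypothesis: $h_\theta(x) = \theta_1 x$

Parameters: $\theta_1$

Cost Function: $J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_1} J(\theta_1)$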

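$h_\theta(x)$ (for fixed $\theta_1$, this is a function of x) and $J(\theta_1)$ (function of the parameter $\theta_1$)

[Figure: left, the line $h_\theta(x) = \theta_1 x$ with $\theta_1 = 1$ through the training examples (1, 1), (2, 2), (3, 3), plotted as y against x; right, $J(\theta_1)$ against $\theta_1$ from -0.5 to 2.5.]

$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_1 x^{(i)} - y^{(i)} \right)^2 = \frac{1}{2 \cdot 3} \left( 0^2 + 0^2 + 0^2 \right) = 0$

$J(1) = 0$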

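[Figure: left, the line $h_\theta(x) = 0.5x$ against the training examples (1, 1), (2, 2), (3, 3), with the vertical distance between each $y^{(i)}$ and $h_\theta(x^{(i)})$ marked; right, $J(\theta_1)$ against $\theta_1$ from -0.5 to 2.5.]

$J(0.5) = \frac{1}{2m} \left[ (0.5 - 1)^2 + (1 - 2)^2 + (1.5 - 3)^2 \right] = \frac{1}{2 \cdot 3} (3.5) \approx 0.58$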

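[Figure: left, the line $h_\theta(x) = 0 \cdot x$ against the training examples (1, 1), (2, 2), (3, 3); right, $J(\theta_1)$ against $\theta_1$ from -0.5 to 2.5.]

$J(0) = \frac{1}{2m} \left( 1^2 + 2^2 + 3^2 \right) = \frac{1}{2 \cdot 3} (14) \approx 2.3$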
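The three values of J worked out on these slides can be checked numerically. This sketch assumes the training set is (1, 1), (2, 2), (3, 3), which is inferred from the plotted axes and from the results 3.5/6 ≈ 0.58 and ≈ 2.3 rather than stated in the text; `cost_theta1` is a hypothetical name for the simplified cost with theta0 = 0.

```python
def cost_theta1(theta1, xs, ys):
    """Simplified cost J(theta1) = 1/(2m) * sum((theta1 * x_i - y_i)^2)."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Assumed training set, inferred from the slide plots.
xs = [1, 2, 3]
ys = [1, 2, 3]

for theta1 in (1.0, 0.5, 0.0):
    print(theta1, round(cost_theta1(theta1, xs, ys), 2))
```

Under that assumption the costs come out to 0 at theta1 = 1, about 0.58 at theta1 = 0.5, and about 2.33 at theta1 = 0, so theta1 = 1 is the minimizer among the three.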

Cost function intuition II

Machine Learning


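Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

Parameters: $\theta_0, \theta_1$

Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$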

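$h_\theta(x)$ (for fixed $\theta_0, \theta_1$, this is a function of x) and $J(\theta_0, \theta_1)$ (function of the parameters $\theta_0, \theta_1$)

[Figure: housing data with the line $h_\theta(x) = 50 + 0.06x$. x-axis: Size in feet² (x); y-axis: Price ($) in 1000’s.]

$\theta_0 = 50$

$\theta_1 = 0.06$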

Contour plots

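[Figure: contour plot of $J(\theta_0, \theta_1)$ over $\theta_0$ and $\theta_1$, shown beside the line $h_\theta(x)$ for one point on the plot.]

h(x) = 360 + 0·x

$\theta_0 = 360$

$\theta_1 = 0$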
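A contour plot of J is drawn from the cost evaluated on a grid of parameter pairs. The sketch below builds such a grid without plotting it, assuming the squared error cost J = 1/(2m) * sum((theta0 + theta1 * x_i - y_i)^2) and the four training rows listed earlier; the grid values and the name `compute_cost` are hypothetical choices for illustration.

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared error cost J = 1/(2m) * sum((theta0 + theta1 * x_i - y_i)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [2104, 1416, 1534, 852]
ys = [460, 232, 315, 178]

# Hypothetical grid; a real contour plot would use many more points.
theta0_values = [0, 120, 240, 360]
theta1_values = [0.0, 0.1, 0.2, 0.3]

# grid[i][j] holds J(theta0_values[i], theta1_values[j]).
grid = [[compute_cost(t0, t1, xs, ys) for t1 in theta1_values]
        for t0 in theta0_values]

for t0, row in zip(theta0_values, grid):
    print(t0, [round(j) for j in row])
```

Each contour line joins parameter pairs with the same J, so the point (theta0 = 360, theta1 = 0) from the slide corresponds to one cell of this grid.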

Gradient descent

Machine Learning


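Have some function $J(\theta_0, \theta_1)$

Want $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Outline:

• Start with some $\theta_0, \theta_1$

• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum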

[Figure: two surface plots of $J(\theta_0, \theta_1)$ over the $\theta_0$ and $\theta_1$ axes.]

Gradient descent algorithm

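repeat until convergence {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$    (for j = 0 and j = 1)
}

$\alpha$: Learning rate

Assignment: a := b

Simultaneously update $\theta_0$ and $\theta_1$.

Correct: Simultaneous update

    temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
    temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
    $\theta_0$ := temp0
    $\theta_1$ := temp1

Incorrect:

    temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
    $\theta_0$ := temp0
    temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
    $\theta_1$ := temp1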
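The difference between the simultaneous and the non-simultaneous update shows up as soon as the two partial derivatives depend on both parameters. The sketch below uses an illustrative function J(theta0, theta1) = theta0^2 + theta0*theta1 + theta1^2, which is an assumption chosen for its simple gradient and is not from the slides; all function names are hypothetical.

```python
def grad(theta0, theta1):
    """Gradient of the illustrative J = theta0^2 + theta0*theta1 + theta1^2."""
    return 2 * theta0 + theta1, theta0 + 2 * theta1

def simultaneous_step(theta0, theta1, alpha):
    # Both partial derivatives are evaluated at the old (theta0, theta1).
    d0, d1 = grad(theta0, theta1)
    temp0 = theta0 - alpha * d0
    temp1 = theta1 - alpha * d1
    return temp0, temp1

def sequential_step(theta0, theta1, alpha):
    # Incorrect ordering: theta0 is overwritten before theta1's
    # derivative is computed, so that derivative sees the new theta0.
    theta0 = theta0 - alpha * grad(theta0, theta1)[0]
    theta1 = theta1 - alpha * grad(theta0, theta1)[1]
    return theta0, theta1

print(simultaneous_step(1.0, 1.0, 0.1))
print(sequential_step(1.0, 1.0, 0.1))
```

The two functions return different values for theta1 from the same starting point, which is why the temporaries matter.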

Gradient descent intuition

Machine Learning


$\theta_1 := \theta_1 - \alpha \frac{d}{d\theta_1} J(\theta_1)$

$\alpha$: learning rate; $\frac{d}{d\theta_1} J(\theta_1)$: derivative

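Where the slope is positive, $\frac{d}{d\theta_1} J(\theta_1) \geq 0$:

$\theta_1 := \theta_1 - \alpha \cdot (\text{positive number})$, so $\theta_1$ decreases.

Where the slope is negative, $\frac{d}{d\theta_1} J(\theta_1) \leq 0$:

$\theta_1 := \theta_1 - \alpha \cdot (\text{negative number})$, so $\theta_1$ increases.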

If α is too small, gradient descent can be slow.

If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
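The effect of the learning rate can be seen on a one-parameter example. This sketch assumes the illustrative cost J(theta) = theta^2, whose derivative is 2 * theta and whose minimum is at theta = 0; the function name `run` and the three values of alpha are hypothetical choices, not taken from the slides.

```python
def run(alpha, theta=1.0, steps=10):
    """Apply theta := theta - alpha * dJ/dtheta for J(theta) = theta^2."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

# Small, moderate, and too-large learning rates.
for alpha in (0.01, 0.4, 1.1):
    print(alpha, run(alpha))
```

After ten steps the small rate has barely moved from the start, the moderate rate is very close to the minimum, and the large rate has overshot repeatedly and ended farther from 0 than where it began.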

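Current value of $\theta_1$ at local optima: the derivative $\frac{d}{d\theta_1} J(\theta_1)$ is 0 there, so the update $\theta_1 := \theta_1 - \alpha \cdot 0$ leaves $\theta_1$ unchanged.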

Gradient descent can converge to a local minimum, even with the learning rate α fixed.

As we approach a local minimum, gradient descent will automatically take smaller steps. So, no need to decrease α over time.
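The shrinking step size with a fixed learning rate can be observed directly. The sketch again assumes the illustrative cost J(theta) = theta^2 with derivative 2 * theta; the starting point and alpha = 0.1 are arbitrary.

```python
alpha = 0.1   # fixed learning rate, never decreased
theta = 1.0   # arbitrary starting point
step_sizes = []

for _ in range(5):
    step = alpha * 2 * theta   # alpha times the derivative of theta^2
    theta = theta - step
    step_sizes.append(abs(step))

print(step_sizes)
```

Each recorded step is smaller than the one before it because the derivative shrinks as theta approaches the minimum, even though alpha never changes.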


Gradient descent for linear regression

Machine Learning


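Gradient descent algorithm

repeat until convergence {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$    (for j = 0 and j = 1)
}

Linear Regression Model

$h_\theta(x) = \theta_0 + \theta_1 x$

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Partial derivatives:

$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right)^2$

j = 0: $\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$

j = 1: $\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$

Gradient descent algorithm

repeat until convergence {
    $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
    $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$
}

update $\theta_0$ and $\theta_1$ simultaneously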
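Gradient descent for univariate linear regression can be sketched as below. The code assumes the squared error cost, whose partial derivatives are (1/m) * sum(h(x_i) - y_i) for theta0 and (1/m) * sum((h(x_i) - y_i) * x_i) for theta1. The function name, the three-point data set, the learning rate, and the iteration count are all illustrative assumptions; a fixed iteration count stands in for "repeat until convergence".

```python
def gradient_descent(xs, ys, alpha, iterations, theta0=0.0, theta1=0.0):
    """Gradient descent for h(x) = theta0 + theta1 * x with squared error cost."""
    m = len(xs)
    for _ in range(iterations):
        # Every step uses all m training examples.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Small illustrative data set lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

theta0, theta1 = gradient_descent(xs, ys, alpha=0.1, iterations=2000)
print(theta0, theta1)
```

On this data the parameters approach theta0 = 0 and theta1 = 1, the line that passes through all three points. Unscaled inputs such as the house sizes in feet² would need a much smaller learning rate to avoid divergence.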

[Figure: surface plot of $J(\theta_0, \theta_1)$ over the $\theta_0$ and $\theta_1$ axes.]

Convex function

Bowl-shaped


“Batch” Gradient Descent

“Batch”: Each step of gradient descent uses all the training examples.
