CSC 4510 – Machine Learning
Dr. Mary-Angela Papalaskari
Department of Computing Sciences
Villanova University
Course website: www.csc.villanova.edu/~map/4510/
5: Multivariate Regression
1 CSC 4510 ‐ M.A. Papalaskari ‐ Villanova University
The slides in this presentation are adapted from:
• Andrew Ng’s ML course http://www.ml-class.org/
Regression topics so far
• Introduction to linear regression
• Intuition – least squares approximation
• Intuition – gradient descent algorithm
• Hands on: Simple example using Excel
• How to apply gradient descent to minimize the cost function for regression
• Linear algebra refresher
What’s next?
• Multivariate regression
• Gradient descent revisited
  – Feature scaling and normalization
  – Selecting a good value for $\alpha$
• Non-linear regression
• Solving for $\theta$ analytically (Normal Equation)
• Using Octave to solve regression problems
What’s next? We are not in univariate regression anymore:

     Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1    2104           5                    1                  45                    460
1    1416           3                    2                  40                    232
1    1534           3                    2                  30                    315
1    852            2                    1                  36                    178
Andrew Ng
Multiple features (variables).

Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178
…              …                    …                  …                     …
Notation:
$n$ = number of features
$x^{(i)}$ = input (features) of the $i^{th}$ training example
$x_j^{(i)}$ = value of feature $j$ in the $i^{th}$ training example
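To make the notation concrete, the following Octave sketch stores the four training examples from the housing table in a matrix and reads off $n$, $x^{(2)}$ and $x_3^{(2)}$. The variable names (`data`, `x_2`, `x_2_3`) are illustrative assumptions, not names used in the slides.

```octave
% Rows are training examples; columns are size, bedrooms, floors, age.
data = [2104 5 1 45;
        1416 3 2 40;
        1534 3 2 30;
         852 2 1 36];

n = size(data, 2);     % number of features
x_2 = data(2, :)';     % x^(2): features of the 2nd training example (column vector)
x_2_3 = data(2, 3);    % x_3^(2): value of feature 3 (floors) in the 2nd example

fprintf('n = %d, x_3^(2) = %d\n', n, x_2_3);
```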
Multiple features (variables).

Size (feet²)   Price ($1000)
2104           460
1416           232
1534           315
852            178
…              …
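Hypothesis:
Previously: $h_\theta(x) = \theta_0 + \theta_1 x$
Now: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$
For convenience of notation, define $x_0 = 1$. Then
$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x$
(multivariate linear regression)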
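As a rough illustration of the hypothesis $h_\theta(x) = \theta^T x$, the sketch below prepends the $x_0 = 1$ column to the housing features and evaluates the hypothesis for all training examples with one matrix product. The values in `theta` are hypothetical and chosen only to show the mechanics; they are not fitted parameters.

```octave
% Training inputs: size, bedrooms, floors, age (from the housing table).
data = [2104 5 1 45;
        1416 3 2 40;
        1534 3 2 30;
         852 2 1 36];
m = size(data, 1);

% Prepend the x_0 = 1 column so that h_theta(x) = theta' * x.
X = [ones(m, 1), data];

% Hypothetical parameter values, for illustration only.
theta = [80; 0.1; 10; 3; -2];

% One prediction per training example: entry i is theta' * x^(i).
h = X * theta;
disp(h');
```

Stacking the examples as rows of `X` means a single product `X * theta` replaces a loop over examples.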
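Hypothesis: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$
Parameters: $\theta_0, \theta_1, \dots, \theta_n$
Cost function: $J(\theta_0, \theta_1, \dots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$, where $m$ is the number of training examples
Gradient descent:
Repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \dots, \theta_n)$
} (simultaneously update for every $j = 0, \dots, n$)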
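The cost function above can be sketched in vectorized Octave as follows. The helper name `compute_cost` is an assumption made for this example. The data are the four housing examples, and the cost is evaluated at $\theta = 0$, where every prediction is zero.

```octave
% Housing features (size, bedrooms, floors, age) and prices ($1000).
data = [2104 5 1 45;
        1416 3 2 40;
        1534 3 2 30;
         852 2 1 36];
y = [460; 232; 315; 178];
m = length(y);
X = [ones(m, 1), data];   % add x_0 = 1

% J(theta) = 1/(2m) * sum over i of (h_theta(x^(i)) - y^(i))^2
compute_cost = @(X, y, theta) sum((X * theta - y) .^ 2) / (2 * length(y));

% With theta = 0 the cost is just the sum of squared prices over 2m.
J0 = compute_cost(X, y, zeros(size(X, 2), 1));
fprintf('J at theta = 0: %g\n', J0);
```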
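Gradient Descent
Previously ($n = 1$):
Repeat {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
  $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$
} (simultaneously update $\theta_0, \theta_1$)
New algorithm ($n \ge 1$):
Repeat {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
} (simultaneously update $\theta_j$ for $j = 0, \dots, n$)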
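A minimal Octave sketch of the multivariate update rule follows. The data set is hypothetical, generated from $y = 1 + 2x_1 + 3x_2$ so that the correct answer is known. The learning rate `alpha` and the iteration count `num_iters` are assumed values that happen to work for this small example.

```octave
% Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (illustration only).
features = [0 1; 1 0; 1 1; 2 1; 0 2];
y = [4; 3; 6; 8; 7];

m = length(y);
X = [ones(m, 1), features];     % add x_0 = 1
theta = zeros(size(X, 2), 1);   % start from theta = 0
alpha = 0.1;                    % learning rate (assumed value)
num_iters = 2000;               % number of iterations (assumed value)

for iter = 1:num_iters
  % Gradient for every j at once: (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i)
  gradient = (X' * (X * theta - y)) / m;
  % Assigning the whole vector updates all theta_j simultaneously.
  theta = theta - alpha * gradient;
end

disp(theta');
```

Computing the full gradient before touching `theta` is what makes the update simultaneous: no component is updated using already-modified values of the others.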
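Feature Scaling
Idea: Make sure features are on a similar scale.
E.g. $x_1$ = size (0–2000 feet²), $x_2$ = number of bedrooms (1–5)
$x_1 = \frac{\text{size (feet}^2)}{2000}$, $x_2 = \frac{\text{number of bedrooms}}{5}$
Get every feature into approximately a $-1 \le x_i \le 1$ range.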
Feature Scaling: mean normalization
Replace $x_i$ with $x_i - \mu_i$ to make features have approximately zero mean (do not apply to $x_0 = 1$).
E.g. $x_1 = \frac{\text{size} - \mu_1}{2000}$, $x_2 = \frac{\text{number of bedrooms} - \mu_2}{5}$, where $\mu_i$ is the mean of feature $i$ over the training set.
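The sketch below applies mean normalization to the housing features in Octave. Dividing by the range (max minus min) of each feature is an assumption made here. The slides divide by a fixed value per feature, and dividing by the standard deviation is another common choice. The variable names are illustrative.

```octave
% Housing features: size, bedrooms, floors, age.
data = [2104 5 1 45;
        1416 3 2 40;
        1534 3 2 30;
         852 2 1 36];
m = size(data, 1);

mu = mean(data);                    % 1 x n row of feature means
spread = max(data) - min(data);     % 1 x n row of (max - min) per feature

% Subtract the mean and divide by the range, column by column
% (relies on automatic broadcasting of the 1 x n rows).
data_norm = (data - mu) ./ spread;

% x_0 = 1 is added afterwards, so it is never normalized.
X = [ones(m, 1), data_norm];
disp(X);
```

The values `mu` and `spread` should be kept, because any new input has to be transformed with the same numbers before a prediction is made.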
Gradient descent
- “Debugging”: How to make sure gradient descent is working correctly.
- How to choose learning rate $\alpha$.
Making sure gradient descent is working correctly.
[Figure: $J(\theta)$ plotted against the number of iterations (0 to 400).]
- For sufficiently small $\alpha$, $J(\theta)$ should decrease on every iteration.
- But if $\alpha$ is too small, gradient descent can be slow to converge.
Declare convergence if $J(\theta)$ decreases by less than a small threshold $\epsilon$ in one iteration?
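One way to carry out this check is to record $J(\theta)$ after every update and stop once the decrease falls below a threshold, as in the sketch below. The data set is hypothetical (generated from $y = 1 + 2x_1 + 3x_2$), and `alpha`, `epsilon` and `max_iters` are assumed values rather than recommendations.

```octave
% Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (illustration only).
features = [0 1; 1 0; 1 1; 2 1; 0 2];
y = [4; 3; 6; 8; 7];
m = length(y);
X = [ones(m, 1), features];
cost = @(theta) sum((X * theta - y) .^ 2) / (2 * m);

theta = zeros(size(X, 2), 1);
alpha = 0.1;         % learning rate (assumed value)
epsilon = 1e-3;      % convergence threshold (assumed value)
max_iters = 10000;   % safety cap so the loop always terminates
J_history = cost(theta);

for iter = 1:max_iters
  theta = theta - alpha * (X' * (X * theta - y)) / m;
  J_history(end + 1) = cost(theta);
  decrease = J_history(end - 1) - J_history(end);
  if decrease < 0
    disp('J increased: alpha is probably too large.');
    break;
  elseif decrease < epsilon
    break;   % declare convergence
  end
end

fprintf('Stopped after %d iterations, J = %g\n', iter, J_history(end));
% plot(0:numel(J_history) - 1, J_history);  % uncomment to plot J against iterations
```

A suitable `epsilon` is hard to pick in advance, which is why plotting `J_history` against the iteration number is often the more reliable check.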
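Summary: Choosing $\alpha$
- If $\alpha$ is too small: slow convergence.
- If $\alpha$ is too large: $J(\theta)$ may not decrease on every iteration; may not converge.
To choose $\alpha$, try a range of values, for example ..., 0.001, 0.01, 0.1, 1, ...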
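The effect of the learning rate can be explored by running a fixed number of iterations for several candidate values and comparing the resulting cost, as sketched below. The data set is hypothetical (generated from $y = 1 + 2x_1 + 3x_2$), and the candidate values in `alphas` and the iteration count are assumptions for this example.

```octave
% Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (illustration only).
features = [0 1; 1 0; 1 1; 2 1; 0 2];
y = [4; 3; 6; 8; 7];
m = length(y);
X = [ones(m, 1), features];
cost = @(theta) sum((X * theta - y) .^ 2) / (2 * m);

alphas = [0.001 0.01 0.1 1];   % candidate learning rates (assumed values)
num_iters = 100;
final_J = zeros(size(alphas));

for k = 1:numel(alphas)
  theta = zeros(size(X, 2), 1);          % restart from theta = 0 each time
  for iter = 1:num_iters
    theta = theta - alphas(k) * (X' * (X * theta - y)) / m;
  end
  final_J(k) = cost(theta);
  fprintf('alpha = %5.3f  ->  J after %d iterations = %g\n', ...
          alphas(k), num_iters, final_J(k));
end
```

On this data the smallest rates make slow progress, while the largest one makes the cost grow instead of shrink.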
Housing prices prediction
Polynomial regression
[Figure: price (y) plotted against size (x).]
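A common way to fit a polynomial with the linear regression machinery, assumed here rather than taken from the slide text, is to treat powers of the size as additional features. The sketch below builds cubic features from the housing sizes and scales them, since the powers differ in magnitude by many orders. The cubic degree and the variable names are illustrative choices.

```octave
% Size (feet^2) from the housing table.
sizes = [2104; 1416; 1534; 852];

% Treat size, size^2 and size^3 as three separate features.
poly_features = [sizes, sizes .^ 2, sizes .^ 3];

% The powers have very different ranges, so apply mean normalization
% (subtract the column mean, divide by the column range).
mu = mean(poly_features);
spread = max(poly_features) - min(poly_features);
poly_scaled = (poly_features - mu) ./ spread;

% Design matrix for h_theta(x) = theta_0 + theta_1*x + theta_2*x^2 + theta_3*x^3.
X = [ones(numel(sizes), 1), poly_scaled];
disp(X);
```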
Choice of features
[Figure: price (y) plotted against size (x).]
Gradient Descent
Normal equation: Method to solve for $\theta$ analytically.
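Intuition: If 1D ($\theta \in \mathbb{R}$): $J(\theta) = a\theta^2 + b\theta + c$. Set $\frac{d}{d\theta} J(\theta) = 0$ and solve for $\theta$.
For $\theta \in \mathbb{R}^{n+1}$: set $\frac{\partial}{\partial \theta_j} J(\theta) = 0$ (for every $j$).
Solve for $\theta_0, \theta_1, \dots, \theta_n$.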
Examples: $m = 4$.

$x_0$   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1       2104           5                    1                  45                    460
1       1416           3                    2                  40                    232
1       1534           3                    2                  30                    315
1       852            2                    1                  36                    178
Examples (with a fifth training example added):

$x_0$   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1       2104           5                    1                  45                    460
1       1416           3                    2                  40                    232
1       1534           3                    2                  30                    315
1       852            2                    1                  36                    178
1       3000           4                    1                  38                    540
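$m$ examples $(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})$; $n$ features.
$x^{(i)} = [x_0^{(i)}, x_1^{(i)}, \dots, x_n^{(i)}]^T \in \mathbb{R}^{n+1}$; $X$ is the $m \times (n+1)$ matrix whose $i^{th}$ row is $(x^{(i)})^T$.
E.g. If $x^{(i)} = [1, x_1^{(i)}]^T$, then $X$ has rows $[1, x_1^{(i)}]$ for $i = 1, \dots, m$, $y = [y^{(1)}, \dots, y^{(m)}]^T$, and $\theta = (X^T X)^{-1} X^T y$.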
$\theta = (X^T X)^{-1} X^T y$
$(X^T X)^{-1}$ is the inverse of the matrix $X^T X$.
Octave: pinv(X'*X)*X'*y
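The Octave expression above can be tried on the five training examples from the table. To keep the matrices small, this sketch uses only two of the four features (size and number of bedrooms), which is an assumption made for the example. The 1650 ft², 3-bedroom house used for the prediction is hypothetical.

```octave
% Five training examples from the table: size (feet^2), bedrooms, price ($1000).
sizes    = [2104; 1416; 1534; 852; 3000];
bedrooms = [5; 3; 3; 2; 4];
y        = [460; 232; 315; 178; 540];

m = length(y);
X = [ones(m, 1), sizes, bedrooms];   % design matrix with x_0 = 1

% Normal equation, written as on the slide.
theta = pinv(X' * X) * X' * y;

% Predicted price for a hypothetical 1650 ft^2, 3-bedroom house.
predicted = [1, 1650, 3] * theta;
fprintf('theta = [%g; %g; %g], predicted price = %g ($1000)\n', theta, predicted);
```

No feature scaling or learning rate is needed here, because $\theta$ is computed in one step rather than by iteration.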
$m$ training examples, $n$ features.

Gradient Descent
• Need to choose $\alpha$.
• Needs many iterations.
• Works well even when $n$ is large.

Normal Equation
• No need to choose $\alpha$.
• Don’t need to iterate.
• Need to compute $(X^T X)^{-1}$.
• Slow if $n$ is very large.
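As a quick sanity check of the comparison, both methods can be run on the same small problem to confirm that they arrive at essentially the same $\theta$. The data set is hypothetical (generated from $y = 1 + 2x_1 + 3x_2$), and `alpha` and `num_iters` are assumed values.

```octave
% Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (illustration only).
features = [0 1; 1 0; 1 1; 2 1; 0 2];
y = [4; 3; 6; 8; 7];
m = length(y);
X = [ones(m, 1), features];

% Gradient descent: needs a learning rate and many iterations.
alpha = 0.1;        % assumed value
num_iters = 2000;   % assumed value
theta_gd = zeros(size(X, 2), 1);
for iter = 1:num_iters
  theta_gd = theta_gd - alpha * (X' * (X * theta_gd - y)) / m;
end

% Normal equation: one step, no alpha, but requires inverting X'X.
theta_ne = pinv(X' * X) * X' * y;

disp([theta_gd, theta_ne]);   % the two columns should agree closely
```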
Notes on Supervised learning and Regression: http://see.stanford.edu/materials/aimlcs229/cs229-notes1.pdf
Octave: http://www.gnu.org/software/octave/
Wiki: http://www.octave.org/wiki/index.php?title=Main_Page
Documentation: http://www.gnu.org/software/octave/doc/interpreter/
Exercise
For next class:
1. Download and install Octave (Alternative: if you have MATLAB, you can use it instead.)
2. Verify that it is working by typing in an Octave command window:
     x = [0 1 2 3]
     y = [0 2 4 6]
     plot(x,y)
   This example defines two vectors, x and y, and should display a plot showing a straight line (the line y=2x). If you get an error at this point, it may be that gnuplot is not installed or cannot access your display. If you are unable to get this to work, you can still do the rest of this exercise, because it does not involve any plotting (just restart Octave). You might refer to the Octave wiki for installation help, but if you are stuck, you can get some help troubleshooting this on Friday afternoon 3-4pm in the software engineering lab (Mendel 159).
3. Create a few matrices and vectors, e.g.:
     A = [1 2; 3 4; 5 6]
     V = [3 5 -1 0 7]
4. Try some of the elementary matrix and vector operations from our linear algebra slides (adding, multiplying between matrices, vectors and scalars)
5. Print out a log of your session