
Regression Analysis: “Regression is that technique of statistical analysis by which when one variable is known the other variable can be estimated. In


Regression Analysis:

“Regression is that technique of statistical analysis by which when one variable is known the other variable can be estimated. In regression analysis the variable, which is known, is called independent variable and which is to be estimated is called dependent variable.”

 


Nondependent and Dependent Relationships

Types of Relationship

Nondependent (correlation) -- neither one of the variables is the target. Example: protein and fat intake.

Dependent (regression) -- the value of one variable is used to predict the value of another variable. Example: ACT and MCAT scores for medical applicants; MCAT is the dependent variable and ACT is the independent variable.

Statistical Expressions

Correlation Coefficient -- index of a nondependent relationship

Regression Coefficient -- index of a dependent relationship


Regression: 3 Main Purposes

To describe (or model)

To predict (or estimate)

To control (or administer)


Simple Linear Regression

Statistical method for finding the “line of best fit” for one response (dependent) numerical variable based on one explanatory (independent) variable.
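A minimal sketch of how such a line can be computed, assuming that "best fit" means the ordinary least-squares criterion. The function name `fit_line` and the sample data are illustrative and do not come from the slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data only: xs is the explanatory variable, ys the response.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
predicted = a + b * 5  # estimate the response for a new value of x
```

For these made-up points the fitted line is y = 1 + 2x, so the estimate at x = 5 is 11.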


Difference between Correlation and Regression:

1. Degree and nature: Correlation studies the relationship between two or more series, whereas regression analysis measures the degree and extent of this relationship, thereby providing a basis for estimation.

 


2. Cause and effect relationship:

Correlation specifies the relationship between two series, but it cannot specify which is the cause and which is the effect. In regression, the series whose value is known is called the independent series, and the series whose value is to be predicted is called the dependent series. The independent series is the cause and the dependent series is the effect.


3. Limit of coefficient:

The coefficient of correlation lies between -1 and +1, but a regression coefficient has no such limit. However, the product of the two regression coefficients cannot be greater than 1, because it equals the square of the correlation coefficient.
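The bound on the product can be seen by writing each regression coefficient in terms of r and the two standard deviations (the standard textbook form, shown here as a short derivation):

```latex
\[
b_{xy} = r\,\frac{\sigma_x}{\sigma_y}, \qquad
b_{yx} = r\,\frac{\sigma_y}{\sigma_x}
\quad\Longrightarrow\quad
b_{xy}\, b_{yx} = r^2 \le 1 \quad \text{since } -1 \le r \le 1 .
\]
```

A single coefficient can still exceed 1 in magnitude, for example when one series is far more spread out than the other.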


Regression Lines:

The regression analysis between two related series of data is usually done with the help of a scatter diagram.

On the scatter diagram obtained by plotting the values of the related series X and Y, two lines of best fit are drawn through the points; these are called regression lines.


Why two regression lines?

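When there are two series there are also two lines of regression, because either series can be estimated from the other.

If the values of the two series are named X and Y, one regression line is called X on Y (used to estimate X when Y is known) and the other is called Y on X (used to estimate Y when X is known).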


Deviations Taken From Arithmetic Mean

(i) Regression equation of X on Y:

$X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$

Here $r \frac{\sigma_x}{\sigma_y}$ is known as the regression coefficient of X on Y.

It is also denoted by $b_{xy}$ or $b_1$. It measures the change in X corresponding to a unit change in Y. Also, $b_{xy} = \frac{\sum xy}{\sum y^2}$.


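(ii) Regression equation of Y on X:

$Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$

Here $r \frac{\sigma_y}{\sigma_x}$ is known as the regression coefficient of Y on X.

It is also denoted by $b_{yx}$ or $b_2$. It measures the change in Y corresponding to a unit change in X. Also, $b_{yx} = \frac{\sum xy}{\sum x^2}$.

Where $x = X - \bar{X}$ (deviation from the mean of the X series) and $y = Y - \bar{Y}$ (deviation from the mean of the Y series).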


Also:

Regression equation of X on Y: $X - \bar{X} = b_1 (Y - \bar{Y})$

Regression equation of Y on X: $Y - \bar{Y} = b_2 (X - \bar{X})$

Here r is the coefficient of correlation between the X and Y series, and $r = \pm\sqrt{b_1 \times b_2}$, taking the sign common to $b_1$ and $b_2$.


Least square method:

X on Y (fitted line $X = a + bY$):

$\sum X = na + b \sum Y$

$\sum XY = a \sum Y + b \sum Y^2$

Y on X (fitted line $Y = a + bX$):

$\sum Y = na + b \sum X$

$\sum XY = a \sum X + b \sum X^2$
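The two normal equations for each line form a 2x2 linear system in a and b. The sketch below solves that system with Cramer's rule; the function name `solve_normal_equations` and the data are illustrative assumptions, not part of the slides.

```python
def solve_normal_equations(xs, ys):
    """Fit Y = a + b*X by solving the two normal equations:
         sum(Y)  = n*a      + b*sum(X)
         sum(XY) = a*sum(X) + b*sum(X^2)
    Returns (a, b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Cramer's rule on the 2x2 system in the unknowns a and b.
    det = n * sum_x2 - sum_x * sum_x
    if det == 0:
        raise ValueError("all X values are equal; no unique solution")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Illustrative data (not from the slides).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
a_yx, b_yx = solve_normal_equations(X, Y)  # Y on X: Y = a + b*X
a_xy, b_xy = solve_normal_equations(Y, X)  # X on Y: swap the roles of the series
```

With this data the two lines come out as Y = 2.2 + 0.6X and X = -1 + 1.0Y. Swapping the arguments is enough to obtain the second line because the X on Y equations have the same form with the roles of the series exchanged.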


Example:

Calculate the regression equations taking deviations of items from the mean of X and Y series.

X    Y
6    9
2    11
10   5
4    8
8    7
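One way to work this example, offered as a sketch that uses the deviation formulas $b_{xy} = \sum xy / \sum y^2$ and $b_{yx} = \sum xy / \sum x^2$:

```latex
\[ \bar{X} = \tfrac{30}{5} = 6, \qquad \bar{Y} = \tfrac{40}{5} = 8 \]

\[
\begin{array}{rr|rr|rrr}
X & Y & x = X - 6 & y = Y - 8 & xy & x^2 & y^2 \\ \hline
6  & 9  & 0  & 1  & 0   & 0  & 1 \\
2  & 11 & -4 & 3  & -12 & 16 & 9 \\
10 & 5  & 4  & -3 & -12 & 16 & 9 \\
4  & 8  & -2 & 0  & 0   & 4  & 0 \\
8  & 7  & 2  & -1 & -2  & 4  & 1 \\ \hline
30 & 40 & 0  & 0  & -26 & 40 & 20
\end{array}
\]

\[
b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{-26}{20} = -1.3, \qquad
b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{-26}{40} = -0.65
\]

\[ X - 6 = -1.3\,(Y - 8) \;\Rightarrow\; X = 16.4 - 1.3\,Y \]
\[ Y - 8 = -0.65\,(X - 6) \;\Rightarrow\; Y = 11.9 - 0.65\,X \]
```

As a check, the product of the two coefficients is 0.845, which does not exceed 1.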


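Deviations taken from assumed mean

(i) Regression equation of X on Y:

$X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$

Here $r \frac{\sigma_x}{\sigma_y} = \frac{N \sum dx\,dy - \sum dx \sum dy}{N \sum dy^2 - (\sum dy)^2}$

(ii) Regression equation of Y on X:

$Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$

Here $r \frac{\sigma_y}{\sigma_x} = \frac{N \sum dx\,dy - \sum dx \sum dy}{N \sum dx^2 - (\sum dx)^2}$

Where dx = X - (assumed mean of X series) and dy = Y - (assumed mean of Y series).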
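The assumed-mean formulas can be sketched in code as follows. The function name and the particular assumed means are hypothetical choices for illustration; the data are the five pairs from the earlier example on deviations from the mean.

```python
def coefficients_from_assumed_means(xs, ys, assumed_x, assumed_y):
    """Return (b_xy, b_yx) using deviations from assumed means.

    dx = X - assumed_x, dy = Y - assumed_y
    b_xy = (N*sum(dx*dy) - sum(dx)*sum(dy)) / (N*sum(dy^2) - sum(dy)^2)
    b_yx = (N*sum(dx*dy) - sum(dx)*sum(dy)) / (N*sum(dx^2) - sum(dx)^2)
    """
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    numerator = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    b_xy = numerator / (n * sum(q * q for q in dy) - sum(dy) ** 2)
    b_yx = numerator / (n * sum(p * p for p in dx) - sum(dx) ** 2)
    return b_xy, b_yx


# Data from the five-pair example; the assumed means 5 and 7 are arbitrary.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]
b_xy, b_yx = coefficients_from_assumed_means(X, Y, assumed_x=5, assumed_y=7)
# A different choice of assumed means gives the same coefficients.
b_xy_alt, b_yx_alt = coefficients_from_assumed_means(X, Y, assumed_x=0, assumed_y=10)
```

Both calls give b_xy = -1.3 and b_yx = -0.65, the same values obtained from deviations about the true means (6 and 8). The regression equations themselves still use the true means of the two series.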


Example:

From the following data on rainfall and rice production, find (i) the most likely production corresponding to a rainfall of 40 cm, and (ii) the most likely rainfall corresponding to a production of 45 kg.

                 Rainfall (cm)   Production (kg)
Mean             35              50
Std deviation    5               8

Coefficient of correlation between rainfall and production = 0.8
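A short sketch of how this example can be solved from the summary statistics alone, using the coefficients r·σy/σx and r·σx/σy. The variable and function names are illustrative.

```python
# Summary statistics given in the example.
mean_rain, mean_prod = 35.0, 50.0  # cm, kg
sd_rain, sd_prod = 5.0, 8.0
r = 0.8

# Regression coefficient of production on rainfall: r * sd_prod / sd_rain.
b_prod_on_rain = r * sd_prod / sd_rain
# Regression coefficient of rainfall on production: r * sd_rain / sd_prod.
b_rain_on_prod = r * sd_rain / sd_prod


def production_for(rainfall_cm):
    """Most likely production (kg) for a given rainfall (cm)."""
    return mean_prod + b_prod_on_rain * (rainfall_cm - mean_rain)


def rainfall_for(production_kg):
    """Most likely rainfall (cm) for a given production (kg)."""
    return mean_rain + b_rain_on_prod * (production_kg - mean_prod)


likely_production = production_for(40)  # part (i)
likely_rainfall = rainfall_for(45)      # part (ii)
print(round(likely_production, 2), round(likely_rainfall, 2))
```

The coefficients are 1.28 and 0.5, which gives about 56.4 kg for part (i) and 32.5 cm for part (ii). Note that each question uses a different regression line: production on rainfall for (i), rainfall on production for (ii).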


Example:

Obtain the lines of regression:

X Y

5 2

6 4

5 8

7 5

2 1
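A worked sketch for this example, taking deviations from the means of the two series:

```latex
\[ \bar{X} = \tfrac{25}{5} = 5, \qquad \bar{Y} = \tfrac{20}{5} = 4 \]
\[ \sum xy = 11, \qquad \sum x^2 = 14, \qquad \sum y^2 = 30
   \qquad (x = X - 5,\; y = Y - 4) \]
\[ b_{yx} = \tfrac{11}{14} \approx 0.786, \qquad b_{xy} = \tfrac{11}{30} \approx 0.367 \]
\[ Y - 4 = \tfrac{11}{14}\,(X - 5) \;\Rightarrow\; Y \approx 0.071 + 0.786\,X \]
\[ X - 5 = \tfrac{11}{30}\,(Y - 4) \;\Rightarrow\; X \approx 3.533 + 0.367\,Y \]
```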

 


Example:

From the following series X and Y, find:

(i) The two regression coefficients.

(ii) The two regression equations.

(iii) The most likely value of X when Y is 34.

(iv) The most likely value of Y when X is 47.

 


Series X Series Y

 48 36

50 32

53 33

49 38

51 37

55 31

53 35

49 30
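The four parts of this example can be sketched in a few lines, taking deviations from the actual means. Variable names are illustrative.

```python
X = [48, 50, 53, 49, 51, 55, 53, 49]
Y = [36, 32, 33, 38, 37, 31, 35, 30]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Deviations of each item from the mean of its own series.
x = [v - mean_x for v in X]
y = [v - mean_y for v in Y]
sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)
sum_y2 = sum(q * q for q in y)

# (i) Regression coefficients.
b_xy = sum_xy / sum_y2  # X on Y
b_yx = sum_xy / sum_x2  # Y on X

# (ii) Regression equations in intercept form.
intercept_x_on_y = mean_x - b_xy * mean_y  # X = intercept_x_on_y + b_xy * Y
intercept_y_on_x = mean_y - b_yx * mean_x  # Y = intercept_y_on_x + b_yx * X

# (iii) and (iv) Most likely values.
x_when_y_34 = intercept_x_on_y + b_xy * 34
y_when_x_47 = intercept_y_on_x + b_yx * 47
```

Here the means are 51 and 34, with Σxy = -16, Σx² = 42 and Σy² = 60, so b_xy ≈ -0.267 and b_yx ≈ -0.381. Because 34 is the mean of Y, the estimate of X in part (iii) is simply the mean of X, 51; part (iv) gives roughly 35.52.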


Example:

The two regression equations are as follows:

20X - 3Y = 975 ... (i)

4Y - 15X + 530 = 0 ... (ii)

Find:

(i) The mean values of X and Y.

(ii) The coefficient of correlation between X and Y.

(iii) The estimated value of Y when X = 90, and that of X when Y = 130.
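A sketch of one way to solve this example. It relies on two facts: both regression lines pass through the point (mean of X, mean of Y), and the product of the two regression coefficients cannot exceed 1, which settles which equation is X on Y. Variable names are illustrative.

```python
from math import sqrt

# Both equations written as a*X + b*Y = c.
# (i)  20X - 3Y = 975
# (ii) 4Y - 15X + 530 = 0  ->  -15X + 4Y = -530
a1, b1, c1 = 20.0, -3.0, 975.0
a2, b2, c2 = -15.0, 4.0, -530.0

# Part (i): the lines intersect at the means; solve by Cramer's rule.
det = a1 * b2 - a2 * b1
mean_x = (c1 * b2 - c2 * b1) / det
mean_y = (a1 * c2 - a2 * c1) / det

# Part (ii): read (i) as X on Y and (ii) as Y on X.
b_xy = -b1 / a1  # from (i):  X = (975 + 3Y) / 20   -> 3/20
b_yx = -a2 / b2  # from (ii): Y = (15X - 530) / 4   -> 15/4
# The opposite reading would give (20/3) * (4/15) = 16/9 > 1, which is impossible.
assert b_xy * b_yx <= 1
r = sqrt(b_xy * b_yx)  # positive, because both coefficients are positive

# Part (iii): use Y on X to estimate Y, and X on Y to estimate X.
y_when_x_90 = mean_y + b_yx * (90 - mean_x)
x_when_y_130 = mean_x + b_xy * (130 - mean_y)
```

This yields means of 66 and 115, r = 0.75, Y = 205 when X = 90, and X = 68.25 when Y = 130.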


Example:

For a certain X and Y series, the two lines of regression are given below:

6Y - 5X = 90

15X - 8Y = 130

Variance of X series is 16.

1. Find the mean value of X and Y series.

2. Coefficient of correlation between X and Y series.

3. Standard deviation of Y series.
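A worked sketch for this example. The first equation is read as Y on X and the second as X on Y, because the opposite reading would make the product of the regression coefficients greater than 1.

```latex
% 1. Means: the two lines intersect at (\bar{X}, \bar{Y}).
\[ 3\,(6Y - 5X) + (15X - 8Y) = 3(90) + 130 \;\Rightarrow\; 10Y = 400
   \;\Rightarrow\; \bar{Y} = 40, \quad \bar{X} = \frac{6(40) - 90}{5} = 30 \]

% 2. Correlation coefficient.
\[ Y = \tfrac{5}{6}X + 15 \;\Rightarrow\; b_{yx} = \tfrac{5}{6}, \qquad
   X = \tfrac{8}{15}Y + \tfrac{26}{3} \;\Rightarrow\; b_{xy} = \tfrac{8}{15} \]
\[ r = +\sqrt{b_{xy}\, b_{yx}} = \sqrt{\tfrac{8}{15}\cdot\tfrac{5}{6}}
     = \sqrt{\tfrac{4}{9}} = \tfrac{2}{3} \]

% 3. Standard deviation of Y, using \sigma_x = \sqrt{16} = 4.
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \;\Rightarrow\;
   \sigma_y = \frac{b_{yx}\,\sigma_x}{r} = \frac{(5/6)(4)}{2/3} = 5 \]
```

The rejected reading would give $b_{xy} = 6/5$ and $b_{yx} = 15/8$, whose product is $9/4 > 1$.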


Real Life Applications

Estimating Seasonal Sales for Department Stores (Periodic)


Real Life Applications

Predicting Student Grades Based on Time Spent Studying


Practice Problems

Measure Height vs. Arm Span

Find line of best fit for height.

Predict height for one student not in data set.

Check predictability of model.


Practice Problems

Is there any correlation between shoe size and height?

Does gender make a difference in this analysis?


Practice Problems

Can the number of points scored in a basketball game be predicted by:

The time a player plays in the game?

The player’s height?


Questions?
