24
Probabilistic and Probabilistic and Statistical Techniques Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Embed Size (px)

Citation preview

Page 1: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Probabilistic and Probabilistic and Statistical TechniquesStatistical Techniques

1

Lecture 24

Eng. Ismail Zakaria El Daour

2010

Page 2: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Correlation

Probabilistic and Statistical Techniques

Page 3: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

OverviewOverview

This chapter introduces important methods for making inferences about a correlation (or relationship) between two variables, and describing such a relationship with an equation that can be used for predicting the value of one variable given the value of the other variable.

Probabilistic and Statistical Techniques

Page 4: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Key Concept

This section introduces the linear

correlation coefficient r, which is a numerical measure of the strength of the relationship between two variables representing quantitative data.

Because technology can be used to find the value of r, it is important to focus on the concepts in this section, without becoming overly involved with tedious arithmetic calculations.

Probabilistic and Statistical Techniques

Page 5: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

DefinitionDefinition

A correlation exists between two variables when one of one of them is related to the other them is related to the other in some way.

Probabilistic and Statistical Techniques

Page 6: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Definition

The linear correlation coefficient rr measures the strength of the linear relationship between paired x- and y- quantitative values in a sample.

Probabilistic and Statistical Techniques

Page 7: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Scatter plots of Paired DataProbabilistic and Statistical Techniques

Page 8: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Scatter plots of Paired Data

Probabilistic and Statistical Techniques

Page 9: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

RequirementsRequirements

1.The sample of paired (x, y) data is a random sample of independent quantitative data.

2.Visual examination of the scatter plot must confirm that the points approximate a straight-line pattern.

3.The outliers must be removed if they are known to be errors. The effects of any other outliers should be considered by calculating r with and without the outliers included.

Probabilistic and Statistical Techniques

Page 10: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Notation for the Linear Correlation Notation for the Linear Correlation CoefficientCoefficient

n represents the number of pairs of data present.

denotes the addition of the items indicated.

x denotes the sum of all x-values.

x2 indicates that each x-value should be squared and then those squares added.

(x)2 indicates that the x-values should be added and the total then squared.

xy indicates that each x-value should be first multiplied by its corresponding y-value. After obtaining all such products, find their sum.

r represents linear correlation coefficient for a sample.

represents linear correlation coefficient for a population.

Page 11: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

The linear correlation coefficient rr measures the strength of a linear relationship between the paired values in a sample.

Formula

Probabilistic and Statistical Techniques

Page 12: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Example: Calculating rr

Using the simple random sample of data below, find the value of r.

x 3 1 3 5y 5 8 6 4

Probabilistic and Statistical Techniques

Page 13: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Example: Calculating r - cont

Probabilistic and Statistical Techniques

Page 14: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Example: Calculating r - cont

x 3 1 3 5y 5 8 6 4

Probabilistic and Statistical Techniques

Page 15: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Properties of the LinearProperties of the Linear Correlation Coefficient Correlation Coefficient rr

1. –1 –1 rr 1 1

2. The value of rr does not change if all values of either variable are converted to a different scale.

3. The value of r is not affected by the choice of x and y. Interchange all x- and y-

values and the value of r will not change.

4. r measures strength of a linear relationship.

Probabilistic and Statistical Techniques

Page 16: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Explained Variation Explained Variation (coefficient of variation) (coefficient of variation)

The value of rr22 is the proportion of the variation (coefficient of variation) in yy that is explained by the linear relationship between x and y.

Probabilistic and Statistical Techniques

Page 17: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Using the duration/interval data an Example, we have found that the value of the linear correlation coefficient r = r = 0.9260.926. What proportion of the variation?

With r = 0.926, we get r2 = 0.857.We conclude that 0.857 (or about 86%) of

the variation (y) (y) can be explained by the

linear relationship between x x and and yy.

This implies that 14% of the variation in yy after cannot be explained by the linear

relationship between x x and and yy.

Example:Probabilistic and Statistical Techniques

Page 18: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Common Errors Common Errors Involving CorrelationInvolving Correlation

1. Averages: Averages suppress individual variation and may inflate the correlation coefficient.

2. Linearity: There may be some relationship between x and y even when there is no linear correlation.

Probabilistic and Statistical Techniques

Page 19: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Note: confidence interval

A confidence interval (or interval estimate) is a range (or an interval) of values used to estimate the true value of a population parameter.

Probabilistic and Statistical Techniques

Page 20: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Confidence Interval for Population Proportion

where

Probabilistic and Statistical Techniques

Page 21: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

General ExampleGeneral Example The figure reveals that μ lies in the 95.44% confidence interval in 19 of the 20 samples, that is, in 95% of the samples We can be 95.44% confident that any computed 95.44% confidence interval will contain μ.

Probabilistic and Statistical Techniques

Page 22: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

Example The U.S. Bureau of Labor Statistics collects information on the ages of people in the civilian labor force and publishes the results in Employment and Earnings. Fifty people in the civilian labor force are randomly selected. Find a 95% confidence interval for the mean age, μ=36.38, of all people in the civilian labor force. Assume that the population standard deviation of the ages is 12.1 years.

Probabilistic and Statistical Techniques

Page 23: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

36.38

12.1

50n

12.1 12.136.38 1.96 36.38 1.96

50 50to

Page 24: Probabilistic and Statistical Techniques 1 Lecture 24 Eng. Ismail Zakaria El Daour 2010

24