milton-martin
Linear Regression
Linear regression describes the link between two factors, i.e. how one value depends on the other.
E.g.
Driver's age: risk of accident
Gender: time spent shopping
Car price: depends on age (of car)
Sales: depend on marketing
Crickets and Temperature
Crickets make their chirping sounds by rapidly sliding one wing over the other.
The faster they move their wings, the higher the chirping sound that is produced.
Analysing the data
Now right-click on the trendline and select Format Trendline, then select Options, and finally select Display Equation on Chart.
Line of Best Fit
You can see differences between the Measured Values and the Calculated values – why?
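Outside a spreadsheet, the same kind of trendline can be sketched with an ordinary least-squares fit. The example below is a minimal sketch, not the spreadsheet's own routine: the function name fit_line is hypothetical, and the data points are made-up illustrative numbers, not the cricket measurements. It prints the fitted equation and the residuals, i.e. the differences between measured and calculated values.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative points (NOT the cricket data from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# Residual = measured value minus the value calculated from the line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print(f"y = {slope:.4f}x + {intercept:.4f}")
print("residuals:", [round(r, 3) for r in residuals])
```

The residuals are rarely all zero: real measurements scatter around the line, which is why an error measure is needed to say how good the fit is.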
Mean Squared Error (MSE)
The mean squared error or MSE of an estimator is the expected value of the square of the "error."
The error is the amount by which the estimator differs from the quantity to be estimated.
The difference occurs because of randomness or because the estimator doesn't account for information that could produce a more accurate estimate.
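As a rough illustration of this definition, the sketch below averages the squared errors over a small sample. A sample average only estimates the expected value in the definition. The function name mean_squared_error and the numbers are assumptions made for illustration and are not taken from the cricket data.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Made-up illustrative values.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 10.0]

mse = mean_squared_error(observed, predicted)
print("MSE:", mse)
```

Squaring makes every error count as positive and penalises large misses more heavily than small ones.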
Root Mean Square Error
The root mean square error (RMSE) is a frequently-used measure of the difference between values predicted by a model and the values actually observed from the thing being modelled or estimated.
The lower the value of the RMSE the better the fit of observed to calculated data.
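A minimal sketch of the calculation, assuming the RMSE is taken as the square root of the average squared error over the sample. The function name rmse and the values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def rmse(observed, predicted):
    """Square root of the mean squared difference between observed and predicted."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


# Made-up illustrative values.
observed = [20.0, 22.0, 25.0, 27.0]
predicted = [21.0, 21.0, 27.0, 27.0]

error = rmse(observed, predicted)
print(f"RMSE: {error:.4f}")
```

Because of the square root, the RMSE is in the same units as the values themselves, which makes it convenient for stating the accuracy of a prediction.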
Stating the Error
For our crickets we could then say:
Temperature Y = 1.8635X - 3.7532
where X is the recorded beats per second of the crickets' wings.
Accurate to ± 2.07 °C
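The stated equation and error can be turned into a small calculation. This is a sketch under assumptions: the slope, intercept and ± 2.07 °C figure come from the text above, but the function names and the example reading of 15 beats per second are hypothetical, and treating the ± figure as a simple band around the prediction is an interpretation rather than something the slides spell out.

```python
SLOPE = 1.8635       # from the fitted equation above
INTERCEPT = -3.7532  # from the fitted equation above
ERROR_C = 2.07       # stated accuracy in degrees Celsius


def predict_temperature(beats_per_second):
    """Temperature in degrees Celsius predicted from wing beats per second."""
    return SLOPE * beats_per_second + INTERCEPT


def temperature_range(beats_per_second):
    """Return (low, high): the prediction minus and plus the stated error."""
    estimate = predict_temperature(beats_per_second)
    return estimate - ERROR_C, estimate + ERROR_C


# Hypothetical reading, chosen only for illustration.
reading = 15.0
estimate = predict_temperature(reading)
low, high = temperature_range(reading)
print(f"{estimate:.2f} C (between {low:.2f} and {high:.2f} C)")
```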
Correlation Coefficient
The correlation coefficient is a measure of how well trends in the predicted values follow trends in the actual values.
It is a measure of how well the predicted values from a forecast model "fit" with the real-life data.
The correlation coefficient is a number between -1 and +1.
If there is no relationship between the predicted values and the actual values the correlation coefficient is 0 or very low (the predicted values are no better than random numbers).
As the strength of the relationship between the predicted values and actual values increases, the correlation coefficient moves further from 0 and closer to +1 or -1.
A perfect fit gives a coefficient of +1.0 or -1.0. Thus the closer the coefficient is to +1 or -1, the better.
Correlation
Two main methods of calculating correlations are:
Spearman's Rank Correlation Coefficient and
Pearson's or the Product-Moment Correlation Coefficient.
Spearman’s Rank Correlation Coefficient
In calculating this coefficient, we use the Greek letter 'rho' (ρ) or r. The formula used to calculate this coefficient is:
$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where d is the difference between the two ranks of each pair of values and n is the number of pairs.
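The rank formula can be sketched in a few lines. The sketch assumes there are no tied values, since the simple formula 1 - 6 Σd² / (n(n² - 1)) only holds exactly in that case. The function names ranks and spearman_rho and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    # d is the difference between the two ranks of each pair.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Made-up illustrative values with no ties.
xs = [10, 20, 30, 40, 50]
ys = [1, 3, 2, 5, 4]

rho = spearman_rho(xs, ys)
print("rho:", rho)
```

Because only the ranks are used, the result is unchanged if the values are rescaled in any way that preserves their order.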
Pearson's or Product-Moment Correlation Coefficient
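The Pearson Correlation Coefficient is denoted by the symbol r. Its formula is based on the standard deviations of the x-values and the y-values:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$$
where s_x and s_y are the sample standard deviations of the x-values and the y-values, and n is the number of pairs.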
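One common way to compute r is to divide the sample covariance of x and y by the product of their sample standard deviations. The sketch below follows that form. The function names sample_std and pearson_r and the data are assumptions made for illustration.

```python
import math


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def pearson_r(xs, ys):
    """Pearson's r: sample covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return covariance / (sample_std(xs) * sample_std(ys))


# Made-up illustrative values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

Unlike the rank-based coefficient, this one uses the actual values, so it measures how close the points lie to a straight line.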
Coefficient of Determination R Squared
Shows the proportion of the variation in y that is explained by x.
The version most common in statistics texts is based on an analysis of variance decomposition as follows:
$$SST = SSR + SSE, \qquad R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
SST is the total sum of squares, SSR is the explained sum of squares, and SSE is the residual sum of squares.
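The three sums of squares can be computed directly for a least-squares line, which also shows that the total splits into the explained and residual parts. This is a minimal sketch: the function name sums_of_squares and the data are hypothetical, and R squared is taken here as SSR divided by SST.

```python
def sums_of_squares(xs, ys):
    """Fit a least-squares line and return (SST, SSR, SSE)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    fitted = [slope * x + intercept for x in xs]

    sst = sum((y - mean_y) ** 2 for y in ys)             # total variation in y
    ssr = sum((f - mean_y) ** 2 for f in fitted)         # variation explained by the line
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # variation left in the residuals
    return sst, ssr, sse


# Made-up illustrative values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, ssr, sse = sums_of_squares(xs, ys)
r_squared = ssr / sst
print(f"SST={sst:.2f} SSR={ssr:.2f} SSE={sse:.2f} R^2={r_squared:.2f}")
```

For a straight-line fit with one x variable, this value equals the square of Pearson's r for the same data.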