When there are two or more variables, there may be some
relationship between (or among) them. In the presence of
randomness, the relationship between two variables will not be
unique: for a given value of one variable, there is a range of
possible values of the other variable.
Slide 4
TYPICAL EXAMPLES OF REGRESSION ANALYSIS
1. Size of a house and its electricity consumption.
2. Size of an automobile and its fuel consumption.
3. Wave height during a storm event and damage at a breakwater.
4. Precipitation and flow in a basin.
5. Cement/water ratio of a concrete block and its resistance to pressure.
6. Number of vehicles at a certain bridge and the time of day.
7. Flows in neighboring basins.
Slide 5
WHAT DO THESE EXAMPLES HAVE IN COMMON? These relations are
NOT of a deterministic (functional) character. In other words, when
one of the variables takes a certain value, the other will not
always take the same value. Its value will vary, more or less, from
one observation to the next because of other variables that were
not considered in the relation.
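The non-deterministic character described above can be illustrated with a small simulation. This is a minimal sketch that is not taken from the slides: the function `consumption`, the linear trend and the noise level are hypothetical values chosen only for illustration.

```python
import random

def consumption(house_size_m2, rng):
    """Hypothetical electricity consumption (kWh/month) for a house size.

    The linear trend (2.5 kWh per m2) and the noise level are assumptions
    made for illustration only; they do not come from measured data.
    """
    trend = 2.5 * house_size_m2           # part explained by house size
    other_effects = rng.gauss(0.0, 30.0)  # effect of variables left out
    return trend + other_effects

rng = random.Random(42)  # fixed seed so the run is repeatable
# Same house size observed five times: the response differs every time.
observations = [consumption(120, rng) for _ in range(5)]
for value in observations:
    print(round(value, 1))
```

The five printed values scatter around the trend value of 300 instead of repeating it, which is the behavior a regression equation has to describe in probabilistic terms.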
Slide 6
Result: The relationship between these variables therefore
requires a probabilistic approach. The mathematical expression
describing a relation of this type is called a REGRESSION
EQUATION.
Slide 7
Simple linear regression
[Equation figure; labeled terms: response, regressor, intercept, slope]
Slide 8
[Equation figure; labeled terms: price of a house, area of the house, age of the house]
Slide 9
Why do we need regression analysis? To check whether there is a
significant relation between the variables under consideration, to
obtain the regression equation expressing this relation, and to
evaluate the confidence interval of the estimates made with
this equation.
Slide 10
Regression analysis can be classified as follows:
1. Simple linear regression analysis: it is assumed that there is a
linear relationship between two variables.
2. Multiple (multivariate) linear regression analysis: it is assumed
that there is a linear relationship among more than two variables.
3. Nonlinear regression analysis: it is assumed that a nonlinear
relation, whose form is given by a pre-selected equation, exists
between two or more variables.
Slide 11
A correlation coefficient ρ_{X,Y} = 0 shows that there is NO
linear dependence between X and Y, whereas an absolute value of
ρ_{X,Y} approaching 1 means that the dependence between the variables
strengthens and approaches a deterministic relationship. (However,
a correlation coefficient close to 1 does not always
indicate a cause-and-effect relationship between the two variables.
Both variables being affected by a third variable may be the
cause of a high correlation coefficient.)
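The two limiting cases can be checked numerically. The sketch below is an illustration only: the function name `pearson_r` and the sample values are assumptions, and the usual product-moment formula is used for the sample correlation coefficient.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
linear = [3 * x + 1 for x in xs]   # deterministic linear relation
parabola = [x * x for x in xs]     # deterministic but nonlinear relation

print(pearson_r(xs, linear))    # 1.0
print(pearson_r(xs, parabola))  # 0.0
```

The parabola case shows why the statement is about linear dependence: Y is completely determined by X, yet the correlation coefficient is zero.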
Slide 12
For small samples, divide by N-1 instead of N in Eq. (7.1).
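Eq. (7.1) is not reproduced in this transcript. Assuming it refers to the sample variance (or covariance) that enters the correlation coefficient, the effect of the divisor can be sketched with Python's standard library; the sample values are hypothetical.

```python
import statistics

sample = [4.0, 7.0, 6.0, 3.0]  # hypothetical small sample (N = 4)

var_n = statistics.pvariance(sample)         # divides by N
var_n_minus_1 = statistics.variance(sample)  # divides by N - 1

print(var_n, var_n_minus_1)
# The ratio equals N / (N - 1), which tends to 1 as N grows.
print(var_n_minus_1 / var_n)
```

The difference matters for small N and fades for large samples. When the same divisor is used for the covariance and for both standard deviations, it cancels in the correlation coefficient itself.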
Slide 17
The estimated statistic r_{X,Y} is NOT equal to the population
value ρ_{X,Y}. We must know the sampling distribution of the
r_{X,Y} statistic in order to decide about the dependence from the
estimated r_{X,Y} value.
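One common way to make this decision, given here as a sketch and not necessarily the test used in the slides, assumes that X and Y are jointly normal. Under the null hypothesis that the population correlation coefficient is zero, the statistic below follows a Student t distribution with N - 2 degrees of freedom, where N (a symbol assumed here) is the number of value pairs:

```latex
t = \frac{r_{X,Y}\,\sqrt{N-2}}{\sqrt{1 - r_{X,Y}^{2}}} \sim t_{N-2}
```

The hypothesis of no linear dependence is rejected at significance level \alpha when |t| exceeds the critical value t_{N-2,\,1-\alpha/2}.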
Slide 18
The Method of Least Squares
y = ax + b
[Expressions for a and b: equation figure]
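The coefficients of the line y = ax + b follow from the standard least-squares argument, sketched here in that notation. The index i, the sample size N and the bar notation for sample means are assumptions of this sketch. The sum of squared vertical deviations is minimized by setting its partial derivatives to zero:

```latex
\begin{aligned}
S(a,b) &= \sum_{i=1}^{N} \left( y_i - a x_i - b \right)^2 \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{N} x_i \left( y_i - a x_i - b \right) = 0 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{N} \left( y_i - a x_i - b \right) = 0 \\
a &= \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}
   = r_{X,Y}\,\frac{s_Y}{s_X} \\
b &= \bar{y} - a\,\bar{x}
\end{aligned}
```

The second condition shows that the fitted line always passes through the point of sample means.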
Slide 19
Example: Annual flows (10^6 m^3) at two stations on the river
Dicle are given below:

Year  1956   1957   1958   1959   1960   1961   1962   1963
X     16077  14817  11720   9352  10537   6743  10162  29232
Y      4629   4556   2507   1612   2125   1054   2272  11883

Year  1964   1965   1966   1967   1968   1969   1970   1971
X     16439  13019  17729  20368  26748  33566  12314  10914
Y         -   4041   5191   5328   6543   7606   3445   3161

However, plotting the observed value pairs is not sufficient on its
own to decide whether there is a significant relationship between
the two random variables.
Slide 20
x̄ = 16273, ȳ = 4397, s_X = 7669, s_Y = 2768. Substituting these
values into the correlation coefficient equation gives r_{X,Y} = 0.868.
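With these summary statistics, a least-squares line y = ax + b can be obtained without returning to the raw data, using the standard identities a = r s_Y / s_X and b = ȳ - a x̄. The sketch below only plugs in the values quoted above. Treating the flow at station Y as the response is an assumption, and the variable and function names are illustrative.

```python
# Summary statistics quoted on the slide (flows in 10^6 m^3).
mean_x, mean_y = 16273.0, 4397.0
s_x, s_y = 7669.0, 2768.0
r_xy = 0.868

# Least-squares line y = a*x + b expressed through r, s_X and s_Y.
a = r_xy * s_y / s_x        # slope
b = mean_y - a * mean_x     # intercept

def predict(x):
    """Estimate the flow at station Y for a given flow x at station X."""
    return a * x + b

print(f"a = {a:.4f}, b = {b:.1f}")
print(f"share of variance explained: {r_xy ** 2:.3f}")
```

Calling `predict(mean_x)` returns the mean of Y, since the fitted line passes through the point of sample means.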
Slide 25
Some assumptions of regression analysis
1- The deviations of the scatter points from the fitted curve have
zero mean and an assumed constant variance.
2- The fitted regression curve may pass close to a certain
percentage of the points, but this alone does not establish the
validity of the method.
3- The prediction errors are expected to follow a Gaussian
distribution, which is not the case in many practical studies.
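The first assumption can be examined on the residuals of a fitted line. The following is a rough sketch with hypothetical data; `fit_line` is an illustrative helper, and comparing the spread in two halves of the sample is only a crude stand-in for a formal constant-variance check. A normality check of the residuals is not shown.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x

# Hypothetical observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = fit_line(xs, ys)
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]

# With an intercept in the model, least-squares residuals average to zero.
mean_residual = sum(residuals) / len(residuals)

# Rough constant-variance check: compare the spread in the two halves of x.
half = len(xs) // 2
spread_low = sum(e * e for e in residuals[:half]) / half
spread_high = sum(e * e for e in residuals[half:]) / half
print(mean_residual, spread_low, spread_high)
```

A zero mean residual is guaranteed by the fitting procedure itself, so the constant-variance and Gaussian assumptions are the ones that need to be checked against the data.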
Slide 26
Computer applications with Excel
Slide 30
Regression analysis by STATISTICA
Slide 36
Question: Derive the regression equation for the
following variables.