Yard. Doç. Dr. Tarkan Erdik Regression analysis - Week 12 1

  • Slide 1
  • Yard. Doç. Dr. Tarkan Erdik Regression analysis - Week 12
  • Slide 2
  • 2
  • Slide 3
  • When there are two or more variables, there may be some relationship between (or among) them. In the presence of randomness, the relationship between two variables will not be unique: for a given value of one variable, there is a range of possible values of the other variable.
  • Slide 4
  • TYPICAL EXAMPLES OF REGRESSION ANALYSIS 1. Size of a house and its electricity consumption. 2. Size of an automobile and its fuel consumption. 3. Wave height during a storm event and damage at a breakwater. 4. Precipitation and flow in a basin. 5. Cement/water ratio in a concrete block and its resistance to pressure. 6. Number of vehicles on a certain bridge and the time of day. 7. Flows in neighboring basins.
  • Slide 5
  • WHAT DO THESE EXAMPLES HAVE IN COMMON? These relations are NOT of a deterministic (functional) character. In other words, when one of the variables takes a certain value, the other will not always take the same value. Its value will vary, more or less, from one observation to the next under the effect of other variables that are not considered in the relation.
  • Slide 6
  • Result: The relationship between these variables therefore requires a probabilistic approach. A mathematical expression describing a relation of this type is called a REGRESSION EQUATION.
  • Slide 7
  • Simple linear regression: $y = ax + b$, where $y$ is the response, $x$ is the regressor, $b$ is the intercept and $a$ is the slope.
  • Slide 8
  • Example with more than one regressor: the price of a house as a function of the area of the house and the age of the house.
  • Slide 9
  • Why do we need regression analysis? To check whether there is a significant relation between the variables under consideration, to obtain the regression equation expressing this relation, and to evaluate the confidence intervals of the estimates made with this equation.
  • Slide 10
  • Regression analysis can be classified as follows: 1. Simple linear regression analysis: a linear relationship is assumed between two variables. 2. Multivariate linear regression analysis: a linear relationship is assumed among more than two variables. 3. Nonlinear regression analysis: a nonlinear relationship, whose form is given by a pre-selected equation, is assumed between two or more variables.
  • Slide 11
  • A correlation coefficient of $\rho_{X,Y} = 0$ shows that there is NO linear dependence between X and Y, whereas $|\rho_{X,Y}|$ approaching 1 means that the dependence between the variables strengthens and approaches a deterministic relationship. (However, a correlation coefficient close to 1 does not always indicate a cause-and-effect relationship between the two variables. Both variables being affected by a third variable may be the cause of a high correlation coefficient.)
  • Slide 12
  • For small samples, divide by N-1 instead of N in Eq. (7.1).
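A minimal Python sketch of the sample correlation coefficient, assuming Eq. (7.1) is the usual product-moment form (the equation itself is not reproduced in this transcript). The function name `pearson_r`, the `ddof` parameter and the five data pairs are illustrative and not taken from the lecture. Note that when the covariance and both standard deviations use the same divisor (N or N-1), the divisor cancels in the ratio, so the choice affects the individual statistics but not r itself.

```python
import math

def pearson_r(x, y, ddof=1):
    """Sample correlation coefficient; ddof=1 divides by N-1, ddof=0 by N."""
    n = len(x)
    if n != len(y) or n - ddof <= 0:
        raise ValueError("x and y must have equal length greater than ddof")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations share the same divisor (n - ddof).
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - ddof)
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - ddof))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - ddof))
    return cov / (s_x * s_y)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))
```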
  • Slide 13
  • 13
  • Slide 14
  • 14
  • Slide 15
  • 15
  • Slide 16
  • 16
  • Slide 17
  • The estimated statistic $r_{X,Y}$ is NOT equal to the population value $\rho_{X,Y}$. To decide about the dependence from the estimated $r_{X,Y}$ value, we must know the sampling distribution of the $r_{X,Y}$ statistic.
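One common way to use the sampling distribution of r is a Student's t test of the hypothesis that the population correlation is zero. The sketch below assumes X and Y are jointly normal; the lecture may use a different test, and the values r = 0.8 and N = 18 are hypothetical. The critical value is the tabulated two-sided 5% point for 16 degrees of freedom.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical sample values, for illustration only.
r_hat = 0.8
n = 18
t = t_statistic(r_hat, n)

# Tabulated two-sided 5% critical value of Student's t for n - 2 = 16
# degrees of freedom (hard-coded to keep the sketch dependency-free).
T_CRIT = 2.120
significant = abs(t) > T_CRIT
print(f"t = {t:.3f}, significant at 5%: {significant}")
```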
  • Slide 18
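  • The Method of Least Squares: for the line $y = ax + b$, the coefficients that minimize the sum of squared deviations $\sum_{i=1}^{N} (y_i - a x_i - b)^2$ are $a = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$ and $b = \bar{y} - a\bar{x}$.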
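The least-squares fit of the line y = ax + b can be sketched in a few lines of Python. This uses the standard closed form, slope a = S_xy / S_xx and intercept b = mean(y) - a * mean(x); the function name `least_squares_line` and the data are illustrative assumptions, not taken from the slides.

```python
def least_squares_line(x, y):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = s_xy / s_xx          # slope
    b = mean_y - a * mean_x  # intercept
    return a, b

# Hypothetical data, for illustration only.
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {a:.2f} x + {b:.2f}")
```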
  • Slide 19
  • Example: Annual flows ($10^6$ m$^3$) at two stations on the river Dicle are given below:

      Year      X      Y
      1956  16077   4629
      1957  14817   4556
      1958  11720   2507
      1959   9352   1612
      1960  10537   2125
      1961   6743   1054
      1962  10162   2272
      1963  29232  11883
      1964  16439      -
      1965  13019   4041
      1966  17729   5191
      1967  20368   5328
      1968  26748   6543
      1969  33566   7606
      1970  12314   3445
      1971  10914   3161

    However, plotting the observed value pairs is not sufficient on its own to decide whether there is a significant relationship between the two random variables.
  • Slide 20
  • $\bar{x} = 16273$, $\bar{y} = 4397$, $s_X = 7669$, $s_Y = 2768$. Substituting these values into the correlation coefficient equation gives $r_{X,Y} = 0.868$.
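As a hedged worked step, the summary statistics above are enough to write down a least-squares line y = ax + b, using the identity a = r * s_Y / s_X and b = mean(y) - a * mean(x). This assumes the four values are the sample means and standard deviations of X and Y computed with the same divisor; the slides that follow are not transcribed, so the lecture's own result may be presented differently.

```python
# Summary statistics quoted in the example (units: 10^6 m^3).
x_bar, y_bar = 16273.0, 4397.0
s_x, s_y = 7669.0, 2768.0
r_xy = 0.868

# Slope and intercept of the regression of Y on X from summary statistics.
a = r_xy * s_y / s_x
b = y_bar - a * x_bar

print(f"a = {a:.4f}, b = {b:.1f}")
```

With these inputs the slope is roughly 0.313 and the intercept roughly -701, so the fitted line passes through the point of means (16273, 4397) by construction.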
  • Slide 21
  • 21
  • Slide 22
  • 22
  • Slide 23
  • 23
  • Slide 24
  • 24
  • Slide 25
  • Some assumptions of regression analysis: 1. The deviations of the scatter points from the fitted curve have zero mean and an assumed constant variance. 2. The fitted regression curve may pass close to a certain percentage of the points, but this alone does not establish the validity of the method. 3. The prediction errors are expected to follow a Gaussian distribution, which is not the case in many practical studies.
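The first assumption can be checked on a fitted line by inspecting the residuals. The sketch below fits y = ax + b by least squares to hypothetical data, then computes the residuals, their mean (zero by construction for a least-squares line with an intercept) and the residual variance estimate SSE / (N - 2). All names and data are illustrative; checking constant variance and normality would need further diagnostics not shown here.

```python
def fit_line(x, y):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = s_xy / s_xx
    return a, mean_y - a * mean_x

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
residuals = [yi - (a * xi + b) for xi, yi in zip(x, y)]

mean_residual = sum(residuals) / len(residuals)
# Two parameters (a and b) were estimated, hence N - 2 degrees of freedom.
residual_variance = sum(e ** 2 for e in residuals) / (len(x) - 2)

print(f"mean residual = {mean_residual:.2e}")
print(f"residual variance = {residual_variance:.3f}")
```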
  • Slide 26
  • Computer applications with Excel
  • Slide 27
  • 27
  • Slide 28
  • 28
  • Slide 29
  • 29
  • Slide 30
  • Regression analysis by STATISTICA
  • Slide 31
  • 31
  • Slide 32
  • 32
  • Slide 33
  • 33
  • Slide 34
  • 34
  • Slide 35
  • 35
  • Slide 36
  • Question: Please derive the regression equation for the following variables