Multicollinearity

The term multicollinearity was first used by Ragnar Frisch. Multicollinearity means that there is a perfect or exact linear relationship among the explanatory variables of a regression. Linear regression analysis assumes that no such exact relationship exists among the explanatory variables; when this assumption is violated, the problem of multicollinearity occurs.
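The meaning of an exact relationship can be made concrete with a small numerical sketch. The following is a minimal illustration (the data and variable names are hypothetical, not from the original article): when one regressor is an exact linear combination of the others, the matrix X'X in the least squares normal equations becomes singular, so no unique coefficient estimates exist.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = 2 * x1 + 3 * x2  # exact linear relationship: perfect multicollinearity

# Design matrix with an intercept column.
X = np.column_stack([np.ones(100), x1, x2, x3])

print(np.linalg.matrix_rank(X))  # 3, not 4: the columns are linearly dependent
print(np.linalg.cond(X.T @ X))   # enormous condition number (numerically singular)
# np.linalg.inv(X.T @ X) would fail or return meaningless values here,
# which is why OLS cannot estimate a unique coefficient for each variable.
```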

Statistics Solutions is the country's leader in dissertation statistical consulting and can assist with your regression analysis. Contact Statistics Solutions today for a free 30-minute consultation.

In regression analysis, multicollinearity is described by degree, as in the following types:

1. No multicollinearity: When the explanatory variables have no relationship with each other, there is no multicollinearity in the data.

2. Low multicollinearity: When there is a relationship among the explanatory variables but it is very weak, this is low multicollinearity.

3. Moderate multicollinearity: When the relationship among the explanatory variables is of moderate strength, it is said to be moderate multicollinearity.

4. High multicollinearity: When the relationship among the explanatory variables is strong, approaching perfect correlation, it is said to be high multicollinearity.

5. Very high (perfect) multicollinearity: When the relationship among the explanatory variables is exact, this is the most serious case, and it must be removed from the data before regression analysis is conducted.

Many factors can give rise to multicollinearity. For example, multicollinearity may arise during the data collection process, or it may be due to a wrongly specified model. If we take income and house size as explanatory variables in the same model, the model will suffer from multicollinearity because income and house size are highly correlated. Multicollinearity may also occur if we include too many explanatory variables in the regression.

Consequences of multicollinearity: If the data suffer from strong or exact multicollinearity, the impact will be the following:

1. In the presence of multicollinearity, the variances and covariances of the coefficient estimators are inflated, which makes it difficult to reach a statistical decision between the null and alternative hypotheses.

2. In the presence of multicollinearity, the confidence intervals are wider because of the larger standard errors. As a result, we may fail to reject a null hypothesis that should be rejected.

3. In the presence of multicollinearity, the standard errors increase, which makes the t statistics smaller; again, we may fail to reject a null hypothesis that should be rejected (see the simulation sketch after this list).

4. Even so, multicollinearity can leave the R-square of the model high, so the apparent goodness of fit of the model is misleading.
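These consequences are easy to reproduce in a small simulation. The sketch below, assuming the statsmodels package and purely illustrative numbers, fits the same model twice, once with nearly uncorrelated regressors and once with highly correlated ones, and compares the standard errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200

def fit_with_correlation(rho):
    """Fit y = 1 + x1 + x2 + noise, where corr(x1, x2) is roughly rho."""
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    y = 1 + x1 + x2 + rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    return sm.OLS(y, X).fit()

low, high = fit_with_correlation(0.1), fit_with_correlation(0.99)
print(low.bse)        # modest standard errors with nearly orthogonal regressors
print(high.bse)       # much larger standard errors -> smaller t statistics
print(high.rsquared)  # yet R-square stays high: the classic symptom
```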


Detection of multicollinearity: The following methods show the presence of multicollinearity:

1. In regression analysis, when the R-square of the model is very high but there are very few significant t ratios, this shows multicollinearity in the data.

2. High pairwise correlation between the explanatory variables also indicates the problem of multicollinearity (both this check and the next are sketched after the list).

3. Tolerance and variance inflation factor (VIF): In regression analysis, the VIF of an explanatory variable is one divided by one minus the R-square from regressing that variable on all the other explanatory variables, that is, VIF_j = 1 / (1 - R_j^2). As the correlation between the regressor variables increases, the VIF also increases, and a higher VIF shows the presence of multicollinearity. The inverse of the VIF is called the tolerance, so the VIF and the tolerance carry the same information. A common rule of thumb is that a VIF greater than 10 signals a multicollinearity problem.
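Both detection checks can be sketched in a few lines. Assuming pandas and statsmodels are available (the data frame below is hypothetical), the pairwise correlation matrix covers method 2, and statsmodels' variance_inflation_factor implements VIF_j = 1 / (1 - R_j^2) for method 3:

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Hypothetical explanatory variables.
df = pd.DataFrame({"income":     [40, 55, 62, 80, 95, 110],
                   "house_size": [1200, 1500, 1650, 2100, 2500, 2900],
                   "rooms":      [3, 4, 4, 5, 6, 7]})

print(df.corr())  # off-diagonal values near +/-1 indicate multicollinearity

X = add_constant(df)  # VIF is computed with an intercept in the auxiliary model
vifs = {col: variance_inflation_factor(X.values, i)
        for i, col in enumerate(X.columns) if col != "const"}
print(vifs)  # values above about 10 are the usual warning sign
```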

Remedial measures for multicollinearity: In regression analysis, the first step is to detect multicollinearity. If multicollinearity is present in the data, we can address the problem in several ways. One option is to drop the variable that carries the specification bias. Multicollinearity can also be reduced by combining cross-sectional data with time-series data, or by adding new data. If there is high multicollinearity, it can often be reduced by transforming the variables, for example by taking first or second differences. In multivariate analysis, the collinear variables can be replaced by a common composite score; in factor analysis, principal component analysis is used to derive such a score.
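Two of these remedies, differencing and principal-component scores, can be sketched as follows (assuming numpy and scikit-learn; the series are simulated purely for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
t = np.arange(100.0)
x1 = t + rng.normal(scale=5, size=100)  # two series driven by the same time trend
x2 = t + rng.normal(scale=5, size=100)
print(np.corrcoef(x1, x2)[0, 1])        # near 1: the shared trend dominates

# Remedy: first differences remove the common trend behind the correlation.
dx1, dx2 = np.diff(x1), np.diff(x2)
print(np.corrcoef(dx1, dx2)[0, 1])      # near 0 after differencing

# Remedy: replace the collinear pair with a single principal-component score.
score = PCA(n_components=1).fit_transform(np.column_stack([x1, x2]))
# `score` can then be used as one composite regressor in place of x1 and x2.
```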