
Introduction to Econometrics 0


Please read carefully all the following teaching notes before proceeding to each chapter. Please remember to contact me if you have any difficulty. I am at home waiting for your e-mail and happy to help you as usual. The teaching notes are not in order; they run from bottom to top. Please check the table of contents to understand the logical order of each chapter. Thanks for your participation.

I cannot upload the whole document, so I have split it into three documents: Introduction to Econometrics 0, Introduction to Econometrics 1 and Introduction to Econometrics 2. I am sorry for this inconvenience.

I hope that you get the best investment and finance jobs in the City and around the U.K. You deserve it. You did your homework and your best to educate yourself.

I have added a detailed section on matrix algebra as an introductory session. Please make yourself familiar with the different types of distributions associated with a random variable, such as the normal distribution, the t-distribution, the chi-squared distribution and the F-distribution. In derivatives investment, we use the cumulative distribution function and the probability density function to price call and put options.
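
As a quick illustration of how the normal cumulative distribution function enters option pricing, here is a minimal Python sketch of the Black-Scholes formulas for a European call and put. The parameter values are hypothetical and chosen only for the example:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal cumulative distribution function, built from erf.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes(S, K, T, r, sigma):
    # Black-Scholes prices of a European call and put.
    # S: spot, K: strike, T: maturity in years, r: risk-free rate, sigma: volatility.
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)
    put = K * exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)
    return call, put

call, put = black_scholes(S=100, K=100, T=1.0, r=0.05, sigma=0.2)
```

The two prices satisfy put-call parity, call minus put equals S minus K e^(-rT), which is a useful sanity check on any implementation.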

Please make yourself familiar with small-sample resampling methods. They are widely used in investment. Two methods in common use are the Monte Carlo method with antithetic variables and the bootstrap method. Please check the resampling methods section for more information. I have added an article to help you understand the bootstrap method, and a detailed example of the Monte Carlo method with antithetic variables compared with the Black-Scholes model. The Black-Scholes model is used in pricing call and put options. The bootstrap method was introduced by Efron (1979). It is based on random resampling, with replacement, from the sample under study. The purpose of resampling is to reduce bias and obtain lower standard errors. You can draw a large number of bootstrap samples to lower the standard errors and construct prediction intervals. You have the option to bootstrap the data or the residuals. Use the percentile method. Bootstrap the residuals from the regression model and check for autocorrelation; bootstrap the data and check for heteroskedasticity. Good luck to those who will work as investment analysts in the major investment banks around the U.K.
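
The percentile bootstrap mentioned above can be sketched in a few lines of Python. The sample below is simulated (entirely hypothetical) monthly returns; the seed, sample size and 95% level are my own choices for the example:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sample of 60 monthly fund returns (simulated for illustration).
sample = rng.normal(loc=0.01, scale=0.05, size=60)

# Draw B bootstrap samples with replacement and record the statistic of interest.
B = 2000
boot_means = np.empty(B)
for b in range(B):
    resample = rng.choice(sample, size=sample.size, replace=True)
    boot_means[b] = resample.mean()

# Percentile method: the 2.5th and 97.5th percentiles give a 95% interval.
lo, hi = np.percentile(boot_means, [2.5, 97.5])
se_boot = boot_means.std(ddof=1)  # bootstrap standard error of the mean
```

The same loop works for any statistic (a regression coefficient, a Sharpe ratio) by replacing `resample.mean()` with that statistic computed on the resample.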

To sum up, once you have formulated the financial or economic theory, check whether there is anything wrong with the model. Check the signs of the coefficients against the literature review and empirical evidence from different research articles. Then check the residuals and remove the outliers, that is, the large residuals. Please check the different types of residuals. Apply the different tests for autocorrelation, heteroskedasticity, omitted variables, errors in variables, and multicollinearity. Check whether the model could be improved or modified. Test for misspecification. Check the Hausman test for endogeneity. For example, you want to check how gross domestic product, GDP, is endogenously affected by money supply and other variables such as industrial production. You run two OLS regressions. In the first regression, we regress the dependent


variable GDP on the independent variables and get the residuals. In the second regression, we regress the money supply on the independent variables, incorporating the residuals from the first regression. Check for consistency, that is, whether the coefficients are significantly different from zero in both regressions. Sometimes, we want to check that the random effects in the history of share prices, mutual funds or corporate bonds are uncorrelated with the dependent variable. The Hausman test compares the fixed-effects and random-effects estimates of the coefficients. In EViews, select View, then Fixed/Random Effects Testing, then Correlated Random Effects - Hausman Test. Check the t-statistic and the probability at the 5% significance level to check for misspecification.
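
The two-regression endogeneity check described above can be sketched with simulated data. This is the control-function variant of the Durbin-Wu-Hausman idea; all the data below are simulated, and the instrument z is valid by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated system: x is endogenous (correlated with u); z is a valid instrument.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

def ols(X, y):
    # Least-squares coefficients via numpy's lstsq.
    return np.linalg.lstsq(X, y, rcond=None)[0]

# First regression: regress the suspect regressor on the instrument(s), keep residuals.
Z = np.column_stack([np.ones(n), z])
v_hat = x - Z @ ols(Z, x)

# Second regression: add the first-stage residuals to the structural equation.
X2 = np.column_stack([np.ones(n), x, v_hat])
beta = ols(X2, y)
# A coefficient on v_hat significantly different from zero signals endogeneity,
# and including v_hat makes the coefficient on x consistent.
```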

I have attached a layout of the cross-section and idiosyncratic random effects output. The header reports:

Dependent Variable:
Method:
Date:
Sample:
Included observations:
Number of cross-sections used:
Total pool (balanced) observations:
Wallace and Hussain estimator of component variances

Variable      Coefficient      Std. Error      t-Statistic      Prob.
GDP
MID

Cross-section random
Idiosyncratic random

To test for omitted variables, select View, then Coefficient Diagnostics, then Omitted Variables - Likelihood Ratio. Insert the names of the variables that were omitted, each separated by a space. The omitted variables should be listed in your dataset. For example, consider the regression equation in EViews 6 written as ln y = ln c ln d, where the omitted variables are the square roots of c and d; you insert them in the box. EViews 6 reports statistics for testing the hypothesis that the coefficients of the omitted variables are jointly zero. Check the F-statistic and the likelihood ratio statistic, LR, and conclude based on the 5% significance level. If the p-values are below 5%, reject the null hypothesis and accept the alternative. If you want to test for an ARCH effect, regress the dependent variable on the independent variables and obtain the estimated least squares residuals. Then regress the squared estimated residuals on the squared estimated residuals lagged one period, and test whether the coefficients are zero. You can spot a significant ARCH effect by using the estimated squared residuals lagged one period as a proxy for omitted lagged values of the dependent and independent variables. Make sure that you include enough lagged values in your regression equation to be able to detect an autoregressive conditional heteroskedastic effect, ARCH.
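
The ARCH check just described (squared residuals regressed on their own lag) can be sketched on a simulated ARCH(1) series; the parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulate an ARCH(1) series: today's variance depends on yesterday's squared shock.
e = np.empty(n)
e[0] = rng.normal()
for t in range(1, n):
    sigma_t = np.sqrt(0.5 + 0.5 * e[t - 1] ** 2)
    e[t] = sigma_t * rng.normal()

# Auxiliary regression: squared residuals on squared residuals lagged one period.
y = e[1:] ** 2
x = e[:-1] ** 2
X = np.column_stack([np.ones(y.size), x])
beta, ss = np.linalg.lstsq(X, y, rcond=None)[:2]

# LM statistic: (number of observations) * R^2, compared with chi-squared(1).
tss = ((y - y.mean()) ** 2).sum()
lm = y.size * (1 - ss[0] / tss)
```

A slope coefficient and LM statistic far from zero indicate a significant ARCH effect; with real data you would include several lags, as the text suggests.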

I am sorry that I do not have access to the SAS regression program, a statistical software package, so I cannot demonstrate it directly. It shows the difference between different types of residuals calculated from a regression equation. They are


important to detect outliers, which should be removed before re-estimating the regression equation. Thus, you have the OLS residuals, the studentized residuals and the difference-in-fits, DFFITS, statistics. The OLS residuals are calculated from the regression equation. The problem is that they do not have the same variance. Observations with large residuals are defined as outliers and are removed from the sample. We also do not know the influence of a single observation on the overall sample. The studentized residual is defined as the predicted residual divided by its standard error and is used to detect outliers. The DFFITS method is also used to detect outliers, by examining the change in the estimated value of the dependent variable from its original value after dropping an observation. The purpose is to see that, at an aggregate level, the outliers are minimized and do not affect the coefficients. Please use the weighted least squares method, as it minimizes the effect of the outliers. Compare the three methods and decide which one is the best for detecting outliers. Please contact the SAS software organisation to see how they can help you. Please select the model that minimizes the sum of squared prediction errors. There are also the recursive residuals, which are obtained by running a regression on n observations and taking the prediction error for observation n+1. Gathering all the prediction errors gives the recursive residuals. They are independent and have a common variance. Good luck!
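
The studentized-residual idea can be reproduced without SAS, using only the hat matrix from a least-squares fit. The data below are simulated, with one outlier planted at index 10:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50

x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=n)
y[10] += 8.0  # plant one outlier to illustrate detection

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# Hat matrix H = X (X'X)^{-1} X'; its diagonal gives the leverage of each point.
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

# Internally studentized residuals: each residual divided by its own standard error.
k = X.shape[1]
s2 = resid @ resid / (n - k)
r_student = resid / np.sqrt(s2 * (1 - h))

outliers = np.where(np.abs(r_student) > 3)[0]  # flag points beyond 3 standard errors
```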

Please make sure that you are familiar with the concepts of the random walk, the moving average process, the autoregressive process, the autoregressive moving average process and the autoregressive integrated moving average process. You will hear in the investment industry that share prices and other investment vehicles behave according to the random walk theory. Under this theory the mean and variance change with time and the process is non-stationary. Once you difference the time series of the share price, it becomes stationary. Check the correlogram of the time series: the correlogram of a stationary series drops off quickly as the number of lags becomes large. Once you have the individual series or variable open in EViews 6, press View and then Correlogram. You can follow the same procedure from the regression output: press View, then Residual Tests, then Correlogram - Q-statistics or Correlogram - Squared Residuals. Please check the correlogram and the partial autocorrelations that I have attached from EViews 6 in the linear regression section. When using seasonal dummies, check the correlogram first. Check the autocorrelation and partial autocorrelation functions and determine whether the data are stationary or need to be differenced to become stationary. In other words, check the correlogram to determine the autoregressive moving average, ARMA, model to be used. For example, if you have 200 observations and only the first 3 autocorrelations are significant, then use a moving average model, MA(3). Choose the number of lags according to the autocorrelation and partial autocorrelation functions of the correlogram.
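
The correlogram logic can be reproduced directly: compute the sample autocorrelation function of a simulated random walk, then of its first difference, and compare how quickly they die out:

```python
import numpy as np

def acf(x, nlags):
    # Sample autocorrelation function: the quantity plotted in a correlogram.
    x = np.asarray(x, dtype=float)
    xd = x - x.mean()
    denom = xd @ xd
    return np.array([1.0] + [(xd[k:] @ xd[:-k]) / denom for k in range(1, nlags + 1)])

rng = np.random.default_rng(3)
rw = np.cumsum(rng.normal(size=1000))  # simulated random walk (non-stationary)

acf_levels = acf(rw, 10)         # stays close to 1 for many lags
acf_diff = acf(np.diff(rw), 10)  # collapses toward 0 after differencing
```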

I have included a detailed section on stationarity in the regression and time-series sections. Check for a random walk with and without a drift. Check for stationarity by including an intercept and a trend. When dealing with share prices or investment products, check for the trend. Check, for example, that the t-statistic on share(-1) is the same as the Augmented Dickey-Fuller test


statistic, ADF. Compare the ADF statistic with the critical values and check whether the data are constant around the mean. We are testing the null hypothesis that the share price has a unit root. Therefore, if the ADF statistic is greater than the critical value, we do not reject the null hypothesis of a unit root. Be careful with the Phillips-Perron test for stationarity, as it is sensitive to misspecifications arising from outliers, although it does allow for heteroskedasticity and serial correlation. Calculate Harvey's R-squared to check for misspecification. It is defined as 1 minus the ratio of the residual sum of squares from the estimated model to the residual sum of squares from a random walk with drift. If you get a negative value, the model is not good.
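
The Dickey-Fuller regression behind the ADF statistic can be sketched as follows. This sketch omits the augmentation lags; with real data you would add lagged differences and compare with Dickey-Fuller critical values (roughly -2.86 at 5% with a constant), not ordinary t tables:

```python
import numpy as np

def df_tstat(y):
    # t-statistic on y(-1) in the regression  dy_t = a + rho * y_{t-1} + e_t.
    dy = np.diff(y)
    X = np.column_stack([np.ones(dy.size), y[:-1]])
    beta, ss = np.linalg.lstsq(X, dy, rcond=None)[:2]
    s2 = ss[0] / (dy.size - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(4)
unit_root = np.cumsum(rng.normal(size=500))  # random walk: has a unit root
stationary = rng.normal(size=500)            # white noise: stationary

t_rw = df_tstat(unit_root)   # mildly negative: cannot reject the unit root
t_st = df_tstat(stationary)  # strongly negative: reject the unit root
```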

The moving average process was covered in the time-series seminar group. Moving averages are used in econometrics to eliminate a trend. For example, sales increase during Christmas, or share prices increase during a specific month due to dividends or other factors. Check for seasonal dummy effects and upward or downward trends. A moving average process uses the error term, lagged different periods, as a random process in a regression equation to capture unobserved variables. An autoregressive process regresses a variable on its own past values, using lagged periods, with an error term of mean zero and variance sigma squared. An autoregressive moving average process is a combination of the autoregressive and moving average processes. The autoregressive integrated moving average process, ARIMA, converts a non-stationary series into a stationary one through differencing. ARIMA is of order (p,d,q); the ARMA model is of order (p,q). Please read the following article in the truncated, dummy variables and panel data analysis section.

Comparison of the Census X12 and Tramo/Seats additive ARIMA(p,d,q) seasonal adjustment models applied to Credit Suisse Asset Management Income and Templeton Global Income closed-end funds. Application of a binary logit regression.

In the above article, you are going to find goodness-of-fit criteria such as the Akaike information criterion, AIC, and the Schwarz criterion, SBC or BIC. When you consider different ARMA models, choose the model with the lowest AIC or BIC value. In addition, check the serial correlation pattern of the residuals. Please make sure that there is no correlation in the error term; make sure that the errors are white noise. But what is the meaning of white noise? White noise is a random process of mutually independent, identically distributed random variables. It has a constant mean and variance. If there is a unit root, difference the series to eliminate the correlation in the error term. Please also remember that with autoregressive models we have to use Durbin's h-test or the Breusch-Godfrey LM test.
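
AIC and BIC can be computed by hand from a least-squares fit. Here is a sketch comparing an intercept-only model with the correctly specified one on simulated data; the Gaussian-likelihood form of the criteria is assumed:

```python
import numpy as np

def ols_aic_bic(y, X):
    # Gaussian-likelihood AIC and BIC for a least-squares fit.
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / n
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + 2 * k, -2 * loglik + k * np.log(n)

rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

aic_bad, bic_bad = ols_aic_bic(y, np.ones((n, 1)))                      # intercept only
aic_good, bic_good = ols_aic_bic(y, np.column_stack([np.ones(n), x1]))  # true model
```

Lower is better for both criteria; the penalty term (2k for AIC, k log n for BIC) is what discourages overfitting when you compare ARMA orders.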

We analyse long-run relationships between non-stationary variables that are stationary in their first differences, I(1). You have to make sure that both the dependent and independent variables are I(1). You cannot regress an I(0) time series on a time series that is of order I(1); both time series have to be of the same order of integration. We speak of cointegration and vector autoregression, VAR, models in the case of I(1) time series. Thus, stationarity is a very important concept to understand before making further analysis of your data. Check that the dependent and independent variables are not drifting away


over time. A long-run relationship shows that there is a cointegrating relationship between the time series. In contrast, if the two series drift apart as time passes, then there is no long-run relationship between them and we speak of a spurious relationship. The null hypothesis tested is that there is no cointegration. Therefore, H0: y and x are not cointegrated; the alternative hypothesis is H1: y and x are cointegrated. If the variables are I(1) but there is no cointegration, use an unrestricted VAR model. Short- and medium-run relationships between the variables can be diagnosed using a vector error correction, VEC, model. This model is used to test Granger causality between the variables. Check the interdependence of different time series as simultaneous equations with lags, and classify the variables as endogenous or exogenous. In this case, a good start is the vector autoregression, VAR, approach. Basically, this approach is based on the ordinary least squares regression model. Please check the following two research papers:

Application of an Unrestricted Vector Autoregressive system in the term structure of the US interest rates. Evidence from short, medium and long-term yields of the US interest rates.

An application of a Johansen Cointegration test and a Vector Error Correction, (VEC) model to test the Granger causality between general government revenues and general government total expenditures in Greece
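
The Engle-Granger two-step logic behind such cointegration tests can be sketched on simulated data sharing a common stochastic trend. The critical values for the residual test are the Engle-Granger ones, not standard Dickey-Fuller tables, and all series here are simulated:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500

# Two I(1) series built from the same random-walk trend, hence cointegrated.
trend = np.cumsum(rng.normal(size=n))
x = trend + rng.normal(size=n)
y = 2.0 + 1.5 * trend + rng.normal(size=n)

# Step 1: regress y on x in levels and keep the residuals.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# Step 2: Dickey-Fuller style regression on the residuals (no constant).
du = np.diff(resid)
zl = resid[:-1]
rho = (zl @ du) / (zl @ zl)
s2 = ((du - rho * zl) ** 2).sum() / (du.size - 1)
t_resid = rho / np.sqrt(s2 / (zl @ zl))
# A strongly negative t_resid means the residuals mean-revert: evidence of cointegration.
```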

To sum up, write your regression equations with the exogenous and endogenous variables, accompanied by lags, according to the economic or financial theory that you are testing. Make sure that the economic or financial interpretation of the variables makes sense, for example, the interrelation of the money supply with inflation, interest rates, and the gross national product, GNP. Classify which variables are endogenous and which are exogenous. Determine how many lags you are going to use.

I have added further explanations of censored and truncated regression models. I have added instructions on how to use a count model in EViews 6. The dependent variable in a count model should be an integer, followed by a list of regressors. Please check the dummy and truncated variables section. I want to make sure that you are comfortable using pooled time-series, cross-section data and panel data in EViews 6. It is important to understand how to organize the different types of data in your work file to facilitate your statistical analysis. I have attached a research paper to help you understand the use of quantile regression and the Pedroni residual cointegration test in EViews 6. Thanks for your participation, attention and willingness to learn new academic concepts. I insist that you deserve to get recruited into the best financial jobs around the U.K.

I have covered the simultaneous equations section in econometrics and time series. Thus, if we have more than one equation linking the variables, we speak of simultaneous regression models. You will find a detailed example in the research paper that I have included in this chapter. The research paper uses a different methodology, with jointly determined equations. It does


not include reduced form equations or reduced form parameters. My purpose was to show you how structural equations are defined in terms of endogenous and exogenous variables with lagged periods. If an equation is overidentified or underidentified, the indirect least squares method is not going to work properly. In EViews 6, use the method of two-stage least squares, 2SLS, to estimate the coefficients. The 2SLS method involves estimating the reduced form equations: obtain the predicted values of the endogenous variable, replace the endogenous variable in the structural equations by its predicted value, and re-estimate the equation. Please pay attention to the different tests related to stationarity in time-series analysis.
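
The 2SLS recipe just described (estimate the reduced form, then replace the endogenous regressor by its fitted value) can be sketched on simulated data with two valid instruments; everything below is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4000

# Structural model y = 1 + 2x + u, with x endogenous and z1, z2 as instruments.
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5]) + 0.7 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

# Stage 1: reduced-form regression of x on the instruments; keep the fitted values.
Z = np.column_stack([np.ones(n), z])
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Stage 2: replace x by its predicted value in the structural equation.
X2 = np.column_stack([np.ones(n), x_hat])
beta_2sls = np.linalg.lstsq(X2, y, rcond=None)[0]

# Plain OLS on the original x is biased here, because x is correlated with u.
X1 = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X1, y, rcond=None)[0]
```

Note that the standard errors from the naive second-stage regression are not correct; dedicated 2SLS routines (such as the one in EViews) adjust them.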

I have added an exercise on Durbin's two-step method. As an exercise, check the Cochrane-Orcutt two-step method and the Hildreth-Lu search procedure. I have added a new section in the autocorrelation chapter on the h-statistic for testing serial correlation in models with a lagged dependent variable. We use this statistic in autoregressive models. Lagged dependent variables are used as explanatory variables when we have adjustment lags and expectations about an investment product. Another test that you can use to detect first-order serial correlation is the Breusch-Godfrey LM test. Finally, I have added the Newey-West estimator, which adjusts the standard errors and t-statistics for heteroskedasticity and autocorrelation; you have to tick the relevant box when running the regression equation in EViews 6. If autocorrelation persists, you need to transform your variables, change them or increase the sample size. Thanks for your time and patience.
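
Durbin's h-statistic can be computed by hand from the Durbin-Watson statistic and the estimated variance of the lagged-dependent-variable coefficient. A sketch on a simulated model with well-behaved errors (all parameters hypothetical):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 300

# y_t = 0.5 + 0.6 * y_{t-1} + x_t + e_t, with independent errors.
x = rng.normal(size=n)
y = np.empty(n)
y[0] = 0.0
for t in range(1, n):
    y[t] = 0.5 + 0.6 * y[t - 1] + x[t] + rng.normal()

Y = y[1:]
X = np.column_stack([np.ones(n - 1), y[:-1], x[1:]])
beta, ss = np.linalg.lstsq(X, Y, rcond=None)[:2]
resid = Y - X @ beta

# Durbin-Watson statistic (biased toward 2 when a lagged y is among the regressors).
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Durbin's h: rho_hat * sqrt(T / (1 - T * Var(b_lag))); valid only if T*Var(b_lag) < 1.
T = Y.size
s2 = ss[0] / (T - X.shape[1])
var_blag = (s2 * np.linalg.inv(X.T @ X))[1, 1]
h = (1 - dw / 2) * np.sqrt(T / (1 - T * var_blag))
# Compare h with the standard normal: |h| > 1.96 suggests first-order serial correlation.
```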

Another way to eliminate the heteroskedasticity problem and get efficient estimates of the coefficients is to deflate the variables, using the reciprocal of the square root or of the original value of one of the independent variables. If more than one independent variable shows the heteroskedastic problem, use the square root of the mean value of the dependent variable as the deflator. You could also use a ratio based on a new variable; the problem that could arise is spurious correlation between the new variable and the dependent and independent variables. The above methods are applied when the error variance of the model is unknown. When the error variance is known, the transformed regression is obtained by dividing each dependent and independent variable by that standard deviation. In EViews 6 you select the weighted least squares option, because each variable is weighted, or divided, by the relevant standard deviation. I have added further explanations of the White heteroskedasticity test. I have added detailed explanations of the steps involved in calculating the deflator. Then, use the different heteroskedasticity tests.
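
The transformed (weighted least squares) regression described above can be sketched as follows. In this simulated example the error standard deviation is made proportional to the regressor x, so dividing everything by x restores a constant variance:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 400

# Heteroskedastic model: the error standard deviation grows with x.
x = rng.uniform(1, 10, n)
y = 3.0 + 2.0 * x + rng.normal(size=n) * x

# Deflate by x:  y/x = b0 * (1/x) + b1 + (e/x),  and e/x now has constant variance.
ys = y / x
Xs = np.column_stack([1.0 / x, np.ones(n)])  # transformed constant, transformed slope
g = np.linalg.lstsq(Xs, ys, rcond=None)[0]
b0_wls, b1_wls = g[0], g[1]  # estimates of the original intercept and slope
```

This is exactly weighted least squares with weights 1/x squared, the transformation that EViews' weighted least squares option performs internally.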

I have added two research papers to help you understand the binary logit model and an application of the Wald test to exchange rates using an ARCH model. I am working very hard to review and cover the required teaching material. I have added an article to help you understand how bootstrap resampling methods are applied to investment trusts. I keep adding new exercises to reinforce your learning experience in a peaceful and relaxed manner. I have started the dummy variables section. I have added a hedge fund article to help you


understand in practice how a dummy variable is applied in a probit regression equation.

I have added two articles to help you understand in practice cointegration, Granger causality and the unrestricted vector autoregressive system. The purpose of writing the articles is to strengthen your research skills in terms of designing the research problem and addressing methodological issues. Good luck once again. I have added new exercises in the regression and multiple regression sections. I may amend some exercises. I am working to cover the dummy variables. At this stage, please have a quick look; there is plenty of work and correction still to be done. I have covered the multicollinearity section.

Using a regression equation in first differences instead of levels is a useful technique for correcting the autocorrelation problem. It leads to a larger Durbin-Watson statistic and a lower residual sum of squares than the equation in levels, but also to a decreased R-squared. Please note that R-squared is not comparable between different models. We prefer models with a high R-squared, as it is an indication of a strong relationship between the variables. When we use first differences, we usually get a low R-squared. A possible explanation of this phenomenon is that the regression equation is misspecified; for example, there are missing independent variables that could explain a high proportion of the changes in the dependent variable.

I have added the Goldfeld-Quandt test and Ramsey's test. I have added a new section on heteroskedasticity to show the difference between the homoskedastic case, where the error term has the same variance, and the heteroskedastic case, where it does not. It is an exercise that shows the relation between expenditures and incomes. I am progressing very fast. I am still working to cover different tests. If there is heteroskedasticity, then the standard errors are biased. I would like to see you in the best places of the Financial Services sector. Thanks for your patience and for your participation. I appreciate your endeavour to succeed in your future career.
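
The Goldfeld-Quandt test mentioned above can be sketched on simulated expenditure-income data in which the spread of expenditure grows with income; all figures are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 200

# Expenditure rises with income, and so does its dispersion (heteroskedasticity).
income = np.sort(rng.uniform(10, 100, n))
expend = 5.0 + 0.6 * income + rng.normal(size=n) * (0.05 * income)

def rss(y, X):
    # Residual sum of squares from a least-squares fit.
    return np.linalg.lstsq(X, y, rcond=None)[1][0]

# Goldfeld-Quandt: sort by the suspect variable, drop c middle observations,
# fit separate regressions on the low and high groups, compare residual variances.
c = 40
lo = slice(0, (n - c) // 2)
hi = slice((n + c) // 2, n)
X = np.column_stack([np.ones(n), income])

rss_lo = rss(expend[lo], X[lo])
rss_hi = rss(expend[hi], X[hi])
gq = rss_hi / rss_lo
# Compare gq with the F distribution with ((n - c)/2 - 2) degrees of freedom each.
```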

I have added a new section to the multiple regression chapter on analysis of variance and the Chow test. It is related to prediction and breakpoints for different time periods. Please make sure that you understand the hypothesis testing. Check the F-statistic and the related p-value of your regression output to find out whether the regression model is highly significant. Check your results at the 5% significance level.
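
The Chow breakpoint test compares the pooled regression with separate regressions on each sub-period. A sketch with a simulated slope break (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(11)
n1, n2 = 60, 60

# Two regimes with different slopes, e.g. before and after a structural break.
x1, x2 = rng.normal(size=n1), rng.normal(size=n2)
y1 = 1.0 + 1.0 * x1 + 0.5 * rng.normal(size=n1)
y2 = 1.0 + 2.0 * x2 + 0.5 * rng.normal(size=n2)

def rss(y, x):
    # Residual sum of squares of a simple linear fit.
    X = np.column_stack([np.ones(y.size), x])
    return np.linalg.lstsq(X, y, rcond=None)[1][0]

rss_pooled = rss(np.concatenate([y1, y2]), np.concatenate([x1, x2]))
rss_split = rss(y1, x1) + rss(y2, x2)

k = 2  # parameters per regression (intercept and slope)
chow_f = ((rss_pooled - rss_split) / k) / (rss_split / (n1 + n2 - 2 * k))
# Compare chow_f with F(k, n1 + n2 - 2k); a large value signals a structural break.
```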

There is plenty of work to be done! I am still revising the sections for mistakes and omissions. If you want, you can have a quick look. There are many sections to be added. If you could contribute comments and suggestions, I would be very happy. Two-way communication will facilitate my work and help me to understand your learning attitude and needs. This will create a positive attitude to learn and succeed in your future career.

Thanks for your patience. E-mail me on [email protected]


Good luck and I wish you success in your career as an investment econometrician.

Introduction to Econometrics. A practical guide for first, second and third year undergraduate, postgraduate and research students.

Dr Michel Zaki Guirguis
09/11/2016
Bournemouth University1
Institute of Business and Law
Fern Barrow
Poole, BH12 5BB, UK
Tel: 0030-210-9841550
Mobile: 0030-6982044429

1 I left Bournemouth University in 2006. The author's permanent address is 94 Terpsichoris Road, Palaio Faliro, Post Code 17562, Athens, Greece.


Table of contents

Introduction and definition of Econometrics

Quick introduction to probability distributions, chi-square test, Chebyshev's theorem and matrix algebra

Linear regression

Multiple regression

Heteroskedasticity

Autocorrelation

Multicollinearity

Errors in variables

Vector autoregressions, cointegration and causality in terms of model selection and specification

Time-series analysis

Simultaneous equations models

Confirmatory data analysis or inferential statistics

Truncated, dummy variables and panel data analysis

Resampling methods


Introduction and definition of Econometrics

Econometrics is a mixed discipline that requires varying degrees of introductory knowledge of probability, probability distributions, statistics, mathematical economics, calculus, and matrix algebra. For example, random variables are described by probability distributions such as the normal distribution, the t-distribution, the chi-squared distribution and the F-distribution, so please make yourself familiar with these distributions. I have kept the matrix algebra at an introductory level and have avoided theorem proofs, as things would otherwise get complicated.

My objective is to teach this subject in a relaxed way by combining the mathematical formulas and the concepts with the statistical software EViews 6. We would like to teach this subject to beginners in an easy way, filling knowledge gaps and at the same time making them happy to participate and learn new concepts. It will be very helpful to read my books on mathematical economics, probability, statistics and discrete mathematics before taking this course. If you have knowledge gaps, I will print extra handouts to eliminate them and keep the level of understanding comfortable. The most important thing is that you are happy when learning new academic concepts. We don't want a disappointed audience that is lost or has knowledge gaps.

Econometrics is basically concerned with economic measurement. It is the application of statistics and mathematics to economic or financial data to obtain and interpret numerical results. Econometricians are economic and financial analysts who are interested in numerical estimation of the relationships between economic or financial variables. It is a discipline that goes beyond expressing economic or financial theory through mathematical equations; it is focused on the empirical estimation of these mathematical equations. Therefore, it would be a good idea to have a look at my books on probability and statistics and on mathematical economics. An econometrician could be asked to estimate the demand or supply function or the price elasticities of different products, and to test empirically the relationships between different products. In addition, the econometrician may be asked, in a company or federal bank setting, to forecast economic or financial variables. As examples, we could mention short-, medium- and long-term interest rates, monthly sales, money supply, inflation and unemployment figures. For example, money supply is the sum of monetary assets in an open economy. It is affected by changes in interest rates, inflation, exchange rates and the business cycle. Another example is the relationship of the price of a product to the quantity demanded, the income of the household and the prices of other products. In this case, the dependent variable is price and the independent variables are quantity, income and the prices of other products.

Specifically, the econometrician initially takes the collected economic or financial data from the statistician, for example, data related to consumption, share prices, disposable income, interest rates, money supply, unemployment rates, inflation, gross domestic product, GDP, and so on. The types of quantitative data that are available are time-series, cross-sectional, pooled and panel data. Time-series data are collected over a period of time. For example, stock prices can be collected on a daily, weekly, monthly or annual basis; the unemployment rate is collected on a monthly basis and GDP on a quarterly basis. Time-series data can also be qualitative, for example male or female, employed or unemployed. These data


are known as dummy or categorical variables and take the values 0 or 1 when used in data analysis. Cross-sectional data are collected at a particular point in time. For example, the survey of consumer expenditures is done at a particular point in time, as are opinion polls before a presidential election or referendum. Pooled data are a combination of time-series and cross-sectional data. For example, data on two different companies for a specific time period are cross-sectional, while data on the two consecutive years are time series. Panel data, also known as longitudinal data, follow the same cross-sectional units over time. These data are obtained from firms or households at periodic intervals to obtain information for further analysis. In other words, panel data are based on repeated observation at periodic intervals to obtain a concise picture of the sample under study.

Then, he/she checks whether the time series of the data have errors or omissions. The disturbance or error term is used in the mathematical equations as a measurement tool. This tool is then used in the economic or financial equations to detect measurement problems. For example, one basic assumption in multiple regression is that the value of the error term in one period is uncorrelated with its value in any other period. This assumption is used to detect autocorrelation. In other words, it is used to ensure that the dependent variable depends only on the independent variables and not on the error term, and it is tested with the well-known Durbin-Watson statistic. We want the estimates to be unbiased and efficient. Therefore, the econometrician checks for violations of the assumptions of the basic model under study. Violations could involve heteroskedasticity, autocorrelation, multicollinearity, truncated variables, errors in variables, and so on. Once these violations have been resolved, he/she checks the adequacy of the model and hypotheses, and the forecasts produced by the econometric model that was initially created. Thus, financial or economic theory is used to support causation between the variables.

Statistics refers to the collection, presentation, analysis, and interpretation of numerical data to make inferences and formulate decisions in an uncertain context. It is divided into descriptive and inferential statistics. Econometrics refers to the application of economic theory to economic variables, or of financial theory to financial data, through statistical techniques, in order to test hypotheses, estimate coefficients and forecast the future values of the variables. The relationships between economic or financial variables are stochastic rather than deterministic and, therefore, we use the residuals or the error term. We want to test the relationship of the dependent variable to the independent variables and the error term. The residual or error term is well defined in a stochastic equation and checked to make sure that measurement is consistent and unbiased. We could measure economic variables such as gross national product, employment, money supply, inflation, interest rates, etc. We could also measure financial variables such as stock market indices, dividend yield, net asset value, premium or discount figures, earnings per share, market capitalization, etc.

Page 12: Introduction to Econometrics 0

Econometrics is divided into theoretical and applied econometrics. Theoretical econometrics develops the methods for the measurement of economic or financial relationships. Applied econometrics examines the structure of the problem and the findings in particular fields of economics or finance; as examples, we can mention demand theory, asset allocation and investment. The findings of a financial or economic theory should be combined with empirical and additional evidence by checking the coefficient estimates and their significance. Significant autocorrelation or heteroskedasticity indicates that we need to change the estimation method.

The methodology of econometrics is based on stochastic mathematical equations of the economic or financial variables under study. We then form a priori expectations about the sign and the size of the parameters. But what do we mean by a priori expectations? The term refers to the sign and the size of the parameters of the model as formulated by economic theory. If the estimated coefficients do not agree with the economic or financial theory, the model must be revised, amended to add or exclude variables, or rejected. Then, we collect the data of the variables under study and estimate their coefficients through the corresponding model. We finally evaluate the coefficients and draw conclusions based on their p-values and the related distribution function. The steps involved in an econometric analysis are as follows:

1. Economic or financial theory.
2. Econometric model based on the stochastic equation with the error term.
3. Data.
4. A priori expectations and information.
5. Estimation of the model.
6. Tests of the hypotheses suggested by the economic or financial theory.
7. Forecasting with the model.

Regression refers to the relationship between one economic or financial variable, the dependent variable, and one or more independent or explanatory variables. When there is only one independent variable, we have simple regression. If we have more than one independent variable, then we have multiple regression. The error term or residual is included, and statistical tests are conducted to verify that it is not correlated with the independent variables and does not bias the relationship between the dependent and independent variables.

In terms of assessment, each week there will be one lecture followed by a seminar group. There will be weekly homework assignments, and the problems will be followed by their solutions. There will be a mid-term assignment followed by a three-hour exam at the end of the academic year, in which you are required to answer four questions out of eight. You will be introduced to the statistical package EViews 6 and will be given time-series and cross-sectional data. During the seminar groups, you will have the opportunity to practice the various estimation techniques provided by EViews 6. You are also encouraged to use pooled data, which are a combination of time-series and cross-sectional data.

Please feel comfortable and enjoy the course in a very relaxed way and thanks for the participation. Good luck in your future career.


Quick introduction to probability distributions, chi-square test, Chebyshev’s theorem and matrix algebra

Describe the concept of a probability distribution and understand how it differs from a relative frequency distribution.

A probability distribution gives the proportion of times each value of a random variable is expected to occur. These probabilities are assigned without any experimentation, so a probability distribution is often referred to as the theoretical relative frequency distribution. This differs from a relative frequency distribution, which refers to the ratio of the number of times each outcome actually occurs to the total number of observations.

Random variable, discrete and continuous probability distribution

A random variable is a variable whose values are associated with some probability of being observed. For example, on one roll of a fair die, we have 6 mutually exclusive outcomes (1, 2, 3, 4, 5 or 6), each associated with a probability of occurrence of 1/6.

Thus the outcome from the roll of a die is a random variable.

Random variables can be discrete or continuous.

Random variables which can assume a countable number of values are called discrete, while those which can assume values corresponding to any of the infinite number of points contained in one or more intervals are called continuous. Random variables are described by a probability distribution. Thus, the probabilities associated with a discrete variable are called a discrete probability distribution. In contrast, the probabilities associated with a continuous variable are called a continuous probability distribution or probability density function. The total area under the probability density function equals 1. We also have a cumulative distribution function (CDF), F(x), which shows the probability that the random variable X is less than or equal to a value x. The derivative of the cumulative distribution function is the probability density function (PDF), so that f(x) ≥ 0 and

P(a ≤ X ≤ b) = ∫[a,b] f(x) dx = F(b) - F(a) ≥ 0.

The probability of a random variable lies between 0 and 1, and the sum of the probabilities of all numerical values of X equals 1.
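As a small illustration, the PMF/CDF relationship for a discrete variable can be sketched in Python with a fair die (the variable names are chosen for this example):

```python
from fractions import Fraction

# PMF of a fair die: each of the six outcomes has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): accumulate the probabilities of all outcomes up to x."""
    return sum(pmf[face] for face in pmf if face <= x)

total = sum(pmf.values())   # the probabilities of all outcomes sum to 1
```

The CDF starts at 0 below the smallest outcome and reaches 1 at the largest, exactly as the definition above requires.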

Example

What is the probability distribution of throwing a die?

Possible outcomes    Probabilities
1                    1/6
2
3
4
5
6
Total                1

The above is a discrete probability distribution.

Solution

Possible outcomes    Probabilities
1                    1/6
2                    1/6
3                    1/6
4                    1/6
5                    1/6
6                    1/6
Total                1

Please consider another example related to the annual return on a derivatives options portfolio.

Return on options portfolio in % (the random variable X)    Probability of return    Cumulative probability
8                                                           0.10                     0.10
10                                                          0.12                     0.10 + 0.12 = 0.22
15                                                          0.22                     0.22 + 0.22 = 0.44
22                                                          0.30                     0.44 + 0.30 = 0.74
31                                                          0.26                     0.74 + 0.26 = 1
Total                                                       1

Based on the above table of a probability distribution of a random variable, calculate the following probabilities.

Calculate the probability that the random variable X is less than 22%: P(X < 22%) = 0.10 + 0.12 + 0.22 = 0.44.

Calculate the probability that the random variable X is greater than 30%: P(X > 30%) = 0.26.

Calculate the cumulative probability of return at 22%: P(X ≤ 22%) = 0.10 + 0.12 + 0.22 + 0.30 = 0.74, which matches the cumulative probability column.
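The probabilities above can be checked with a short Python sketch; the return levels and probabilities are taken from the table:

```python
# Options-portfolio distribution: return level (%) -> probability.
dist = {8: 0.10, 10: 0.12, 15: 0.22, 22: 0.30, 31: 0.26}

p_below_22 = sum(p for x, p in dist.items() if x < 22)     # P(X < 22%)
p_above_30 = sum(p for x, p in dist.items() if x > 30)     # P(X > 30%)
cum_at_22  = sum(p for x, p in dist.items() if x <= 22)    # F(22) = P(X <= 22%)
```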

Please consider a discrete random variable associated with a probability distribution and a cumulative distribution.

Discrete random variable    Probability of occurrence    Cumulative function
2                           0.40                         0.40
3                           0.40                         0.80
7                           0.10                         0.90
8                           0.10                         1
Total                       1

Calculate the following probabilities for this discrete random variable.

P(2 ≤ X ≤ 7) = 0.40 + 0.40 + 0.10 = 0.90

P(X<7) = 0.40 + 0.40 = 0.80

P(X<3) = 0.40


Expected value, variance and standard deviations of a probability distribution of a random variable

The expected value is the first moment of the probability distribution. The expected value of a random variable X is the weighted average from the multiplication of individual numerical values with their related probabilities. The mathematical formula is as follows:

E(X) = Σ Xi P(Xi), where the sum runs over i = 1, …, n

The variance is the second moment of the probability distribution. It is calculated as the weighted average of the squared differences between the actual and expected values of the random variable X. It is called the weighted average as the weights represent individual probabilities of each numerical value. The mathematical equation is as follows:

σx² = Σ [Xi - E(X)]² P(Xi)

The standard deviation is the square root of the variance.

σx = √σx²

Let’s consider an example to better understand how the formulas are working in practice.

It is required to calculate the expected return or mean, the variance and the standard deviation of futures returns based on the following probability distribution table.

Futures return of the random variable Xi in %    Probability of return P(Xi)
8                                                0.10
2                                                0.30
4                                                0.50
10                                               0.10
Total                                            1


Solution

The expected value is calculated from the following equation:

E(X) = Σ Xi P(Xi)

E(X) = 0.08 * 0.10 + 0.02 * 0.30 + 0.04 * 0.50 + 0.10 * 0.10 = 0.008 + 0.006 + 0.02 + 0.01 = 0.044 or 4.4 %

Xi    P(Xi)    Xi - E(X)        [Xi - E(X)]² P(Xi)
8     0.10     8 - 4.4 = 3.6    (3.6)² x 0.10 = 1.296
2     0.30     2 - 4.4 = -2.4   (-2.4)² x 0.30 = 1.728
4     0.50     4 - 4.4 = -0.4   (-0.4)² x 0.50 = 0.08
10    0.10     10 - 4.4 = 5.6   (5.6)² x 0.10 = 3.136
Total 1                         6.24

The variance and the standard deviation are calculated as follows:

σx² = Σ [Xi - E(X)]² P(Xi)

σx² = 6.24

The standard deviation is the square root of the variance:

σx = √6.24 = 2.50% (to 2 d.p.)
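The whole calculation can be reproduced in a few lines of Python (returns kept in %, as in the table above):

```python
returns = [8, 2, 4, 10]
probs   = [0.10, 0.30, 0.50, 0.10]

# E(X): probability-weighted average of the returns.
mean = sum(x * p for x, p in zip(returns, probs))
# Variance: probability-weighted average of squared deviations from E(X).
variance = sum((x - mean) ** 2 * p for x, p in zip(returns, probs))
std_dev = variance ** 0.5
```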


Exercise

The probability distribution from the sales of second hand cars nominated as X variable is as follows:

Xi     Probability
100    0.02
200    0.05
300    0.10
400    0.25
500    0.15
600    0.23
700    0.12
800    0.04

The car agency pays 5,000 Euro for each car and sells it for 5,500 Euro. Thus, the agency makes a profit of 500 Euro per car sold, and for each unsold car the agency loses 5,000 Euro. Calculate the expected value, the variance and the standard deviation of the number of cars sold if the number of ordered cars is between 100 and 500, 100 and 600, and 100 and 700.

Solution

The expected value for the ordered cars between 100 - 500 is calculated from the following equation:

E(X) = Σ Xi P(Xi)

Xi     Probability
100    0.02
200    0.05
300    0.10
400    0.25
500    0.15

E(X) = 0.02 * 100 + 0.05 * 200 + 0.10 * 300 + 0.25 * 400 + 0.15*500 =

E(X) = 2 + 10 + 30 + 100 + 75 = 217


Xi     P(Xi)    Xi - E(X)           [Xi - E(X)]² P(Xi)
100    0.02     100 - 217 = -117    (-117)² x 0.02 = 273.78
200    0.05     200 - 217 = -17     (-17)² x 0.05 = 14.45
300    0.10     300 - 217 = 83      (83)² x 0.10 = 688.9
400    0.25     400 - 217 = 183     (183)² x 0.25 = 8372.25
500    0.15     500 - 217 = 283     (283)² x 0.15 = 12013.35
Total                               21362.73

The variance and the standard deviation are calculated as follows:

σx² = Σ [Xi - E(X)]² P(Xi)

σx² = 21362.73

The standard deviation is the square root of the variance:

σx = √21362.73 = 146.16 (to 2 d.p.)

The expected value for the ordered cars between 100 - 600 is calculated from the following equation:

E(X) = Σ Xi P(Xi)

Xi     Probability
100    0.02
200    0.05
300    0.10
400    0.25
500    0.15
600    0.23

E(X) = 0.02 * 100 + 0.05 * 200 + 0.10 * 300 + 0.25 * 400 + 0.15*500 + 600 *0.23 =

E(X) = 2 + 10 + 30 + 100 + 75 + 138 = 355


Xi     P(Xi)    Xi - E(X)           [Xi - E(X)]² P(Xi)
100    0.02     100 - 355 = -255    (-255)² x 0.02 = 1300.5
200    0.05     200 - 355 = -155    (-155)² x 0.05 = 1201.25
300    0.10     300 - 355 = -55     (-55)² x 0.10 = 302.5
400    0.25     400 - 355 = 45      (45)² x 0.25 = 506.25
500    0.15     500 - 355 = 145     (145)² x 0.15 = 3153.75
600    0.23     600 - 355 = 245     (245)² x 0.23 = 13805.75
Total                               20270

The variance and the standard deviation are calculated as follows:

σx² = Σ [Xi - E(X)]² P(Xi)

σx² = 20270

The standard deviation is the square root of the variance:

σx = √20270 = 142.37 (to 2 d.p.)

The expected value for the ordered cars between 100 -700 is calculated from the following equation:

E(X) = Σ Xi P(Xi)

Xi     Probability
100    0.02
200    0.05
300    0.10
400    0.25
500    0.15
600    0.23
700    0.12

E(X) = 0.02 * 100 + 0.05 * 200 + 0.10 * 300 + 0.25 * 400 + 0.15*500 + 600 *0.23 + 700 * 0.12 =

E(X) = 2 + 10 + 30 + 100 + 75 + 138 + 84 = 439


Xi     P(Xi)    Xi - E(X)           [Xi - E(X)]² P(Xi)
100    0.02     100 - 439 = -339    (-339)² x 0.02 = 2298.42
200    0.05     200 - 439 = -239    (-239)² x 0.05 = 2856.05
300    0.10     300 - 439 = -139    (-139)² x 0.10 = 1932.1
400    0.25     400 - 439 = -39     (-39)² x 0.25 = 380.25
500    0.15     500 - 439 = 61      (61)² x 0.15 = 558.15
600    0.23     600 - 439 = 161     (161)² x 0.23 = 5961.83
700    0.12     700 - 439 = 261     (261)² x 0.12 = 8174.52
Total                               22161.32

The variance and the standard deviation are calculated as follows:

σx² = Σ [Xi - E(X)]² P(Xi)

σx² = 22161.32

The standard deviation is the square root of the variance:

σx = √22161.32 = 148.87 (to 2 d.p.)
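All three cases follow the same recipe, so a small Python sketch can loop over the order ranges. It follows the worked solution above, which does not renormalise the truncated probabilities; the function name is chosen for this example:

```python
cars  = [100, 200, 300, 400, 500, 600, 700]
probs = [0.02, 0.05, 0.10, 0.25, 0.15, 0.23, 0.12]

def moments(n):
    """E(X), variance and standard deviation over the first n rows of the table."""
    xs, ps = cars[:n], probs[:n]
    mean = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
    return mean, var, var ** 0.5

# Ranges 100-500, 100-600 and 100-700 use the first 5, 6 and 7 rows.
results = {n: moments(n) for n in (5, 6, 7)}
```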


Bivariate probability distribution

A bivariate probability distribution function relates to the joint probability of two variables. If we have one variable, then we are dealing with a univariate probability distribution function. A multivariate probability distribution relates to more than two variables. For simplicity, we will study the frequency distribution and the bivariate probability distribution of two variables.

For example, let's consider the frequency distribution of the sales of cars and motorcycles, with the two variables represented by X and Y.

The frequency distribution of the two random variables X and Y

                                  Number of cars sold (X)
Number of motorcycles sold (Y)    2     3     4     Total
2                                 4     5     7     16
3                                 2     3     8     13
4                                 7     2     1     10
Total                             13    10    16    39

We convert this table to a bivariate probability distribution by dividing the number of sales by the total 39. For example, 4 / 39 = 0.10. We repeat the same procedure for the rest of the numerical sales and total figures. Thus, the table will be as follows:

The bivariate probability distribution of the two random variables X and Y

                                  Number of cars sold (X)
Number of motorcycles sold (Y)    2       3       4       Total
2                                 0.10    0.13    0.18    0.41
3                                 0.05    0.08    0.21    0.33
4                                 0.18    0.05    0.03    0.26
Total                             0.33    0.26    0.41    1.00

For example, the probability of selling four cars and four motorcycles is 0.03, and the probability of selling 3 cars and 2 motorcycles is 0.13. The grand total should add to 1.00.


Marginal probability functions

If we have two variables, the marginal probability of X is the probability that X takes a given value irrespective of the value taken by the variable Y. The distribution of these probabilities for each variable is known as the marginal probability function of X or Y, and the probabilities in each marginal function sum to one.

X variable    f(X)        Y variable    f(Y)
2             0.33        2             0.41
3             0.26        3             0.33
4             0.41        4             0.26
Total         1           Total         1

The probability that the variable X takes the value 3 is 0.26, irrespective of the value taken by Y; this is the marginal probability of X. Similarly, the marginal probability that Y equals 4 is 0.26, and the marginal probability function of Y sums to 1.

Exercise

Assume the joint probability distribution of two variables X and Y.

         Y = 1    Y = 2    Y = 3
X = 1    0.3      0.04     0.3
X = 2    0.4      0.02     0.4
X = 3    0.3      0.4      0.2

Calculate the marginal probabilities of X and Y.

Solution

P(X=1) = 0.3 + 0.04 + 0.3 = 0.64

P(X=2) = 0.4 + 0.02 + 0.4 = 0.82

P(X=3) = 0.3 + 0.4 + 0.2 = 0.9

P(Y=1) = 0.3 + 0.4 + 0.3 = 1

P(Y=2) = 0.04 + 0.02 + 0.4 = 0.46

P(Y=3) = 0.3 + 0.4 + 0.2 = 0.9
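The marginal probabilities are just row and column sums of the joint table, which a short Python sketch makes explicit:

```python
# Joint probability table from the exercise: rows are X = 1, 2, 3; columns are Y = 1, 2, 3.
joint = [
    [0.3, 0.04, 0.3],
    [0.4, 0.02, 0.4],
    [0.3, 0.4,  0.2],
]

marginal_x = [sum(row) for row in joint]        # P(X=1), P(X=2), P(X=3)
marginal_y = [sum(col) for col in zip(*joint)]  # P(Y=1), P(Y=2), P(Y=3)
```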


t-distribution

For a sample size of less than 30, we use the t-distribution. For example, suppose the sample size of cars sold is 10, the average number of sales is 25 and the sample standard deviation is 3. Test whether the true average is 15.

t = (x̄ - μ) / (s / √n)

where x̄ is the sample mean, μ is the assumed true average, s is the sample standard deviation and n is the sample size.

t = (25 - 15) / (3 / √10) = 10 / 0.949 = 10.54

We state the hypotheses as follows:

H0: The true average is 15.
H1: The true average is not equal to 15.

The degrees of freedom are 10 - 1 = 9. From the t-distribution table in the appendix, the critical value at the 5% significance level is 2.262. Since the t-statistic 10.54 is greater than 2.262, we reject H0. In other words, the true average is not equal to 15.
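The one-sample t statistic above can be checked with a short Python sketch:

```python
import math

# Sample mean, assumed true average, sample standard deviation, sample size.
x_bar, mu, s, n = 25, 15, 3, 10

t = (x_bar - mu) / (s / math.sqrt(n))   # one-sample t statistic

t_critical = 2.262                      # two-sided 5% critical value, 9 df (from tables)
reject_h0 = abs(t) > t_critical
```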

In Excel, I will show how to perform a two-sample t-test assuming equal variances. Please enter the following data in Excel, then choose Tools > Data Analysis and select "t-Test: Two-Sample Assuming Equal Variances" (and, for the second example, "t-Test: Two-Sample Assuming Unequal Variances").

The dataset is as follows:

x1 40 20 10

x2 30 40 50


t-Test: Two-Sample Assuming Equal Variances

                                x1          x2
Mean                            23.33333    40
Variance                        233.3333    100
Observations                    3           3
Pooled Variance                 166.6667
Hypothesized Mean Difference    0
df                              4
t Stat                          -1.58114
P(T<=t) one-tail                0.094502
t Critical one-tail             2.131846
P(T<=t) two-tail                0.189004
t Critical two-tail             2.776451

The statistics that we are interested in are the mean, the variance, the number of observations, the t-statistic and the one- and two-tail t critical values. We state the hypotheses as follows:

H0: The two population means are equal.
H1: The two population means are not equal.

Then, we compare the absolute value of the t-statistic with the one- and two-tail critical values. In our case, |t| = 1.58 < 2.13 for the one-tail test and 1.58 < 2.78 for the two-tail test. Therefore, the sample evidence suggests that we cannot reject H0: there is no significant difference between the two means.
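The Excel output can be reproduced with a pure-Python sketch of the pooled (equal-variances) two-sample t statistic:

```python
import math
import statistics

x1 = [40, 20, 10]
x2 = [30, 40, 50]

n1, n2 = len(x1), len(x2)
v1, v2 = statistics.variance(x1), statistics.variance(x2)    # sample variances
# Pooled variance weights each sample variance by its degrees of freedom.
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_stat = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
```

This matches the Mean, Variance, Pooled Variance, df and t Stat rows of the Excel table.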

In Excel, I will show how to perform a two-sample t-test assuming unequal variances.

The dataset is as follows:

x1 40 20 10

x2 30 40 50

t-Test: Two-Sample Assuming Unequal Variances

                                x1          x2
Mean                            23.33333    40
Variance                        233.3333    100
Observations                    3           3
Hypothesized Mean Difference    0
df                              3
t Stat                          -1.58114
P(T<=t) one-tail                0.105993
t Critical one-tail             2.353363
P(T<=t) two-tail                0.211985
t Critical two-tail             3.182449

The statistics that we are interested in are the mean, the variance, the number of observations, the t-statistic and the one- and two-tail t critical values. We state the hypotheses as follows:

H0: The two population means are equal.
H1: The two population means are not equal.

Then, we compare the absolute value of the t-statistic with the one- and two-tail critical values. In our case, |t| = 1.58 < 2.35 for the one-tail test and 1.58 < 3.18 for the two-tail test. Therefore, the sample evidence suggests that we cannot reject H0: there is no significant difference between the two means.

F-distribution

Let's assume that we have 6 observations and two independent variables. The F-statistic equals the ratio of the mean square of the explained variation, MSR = 104.6348, to the mean square of the unexplained variation, MSE = 10.91.

F = MSR / MSE = 104.6348 / 10.91 = 9.59 (to 2 d.p.)

The F-statistic has a null and alternative hypothesis:

H0: b = 0
H1: b ≠ 0

The degrees of freedom are as follows:

df numerator = k = 2 (because we have 2 independent variables)
df denominator = n - k - 1 = 6 - 2 - 1 = 3

The critical F-value for 2 and 3 degrees of freedom at the 5% significance level is 9.55. Since F = 9.59 is greater than 9.55, the results are significant at the 5% significance level. The sample evidence suggests that we can reject H0 and conclude that the slope coefficients are significantly different from zero. This means that the independent variables contribute significantly to explaining the dependent variable.

Let's take another example to show the F-test of significance of two samples drawn from two variables x and y. The sample size is 20 for each. The variance for x is 100.35 and the variance for y is 50.56. The F-test is the ratio of the variances of the two samples:

F = sx² / sy² = 100.35 / 50.56 = 1.98

The F-value is compared with the critical value from the F table in the appendix of the book. For a two-sample test of variances, the degrees of freedom are n - 1 for each sample:

df numerator = 20 - 1 = 19
df denominator = 20 - 1 = 19

The critical F-value for 19 and 19 degrees of freedom at the 5% significance level is approximately 2.17. Since F = 1.98 is smaller than the critical value, the sample evidence suggests that the result is not significant. We cannot reject the null hypothesis that the two population variances are the same.
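Both F statistics above are simple ratios, so they are easy to check in Python (the numbers are taken from the two examples):

```python
# Regression F test: ratio of explained to unexplained mean squares.
msr, mse = 104.6348, 10.91
f_regression = msr / mse       # compare with the tabulated F critical value for (2, 3) df

# Two-sample variance-ratio F test.
var_x, var_y = 100.35, 50.56
f_variances = var_x / var_y    # compare with the tabulated F critical value
```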

Exercise

A financial adviser wants to compare the mean returns and the risk, measured by the variance, of two stocks. He gets the following results:

                               Stock 1 (n1 = 4)    Stock 2 (n2 = 4)
Numerical values of returns    0.10                0.20
                               0.20                0.22
                               0.22                0.25
                               0.15                0.30
Numerical values of risk       0.50                0.10
                               0.55                0.15
                               0.60                0.20
                               0.45                0.25

We assume that intraday price changes are normally distributed. Are there any significant differences in the mean returns and risks? Please state which tests you are going to use.

Solution

We are going to use the t-test to test for any significant difference in the mean returns. We are testing the following hypotheses:

H0: μ1 = μ2
H1: μ1 ≠ μ2

t-Test: Paired Two Sample for Means

                                Stock 1, mean returns    Stock 2, mean returns
Mean                            0.17                     0.24
Variance                        0.003                    0.002
Observations                    4                        4
Pearson Correlation             0.22
Hypothesized Mean Difference    0
df                              3
t Stat                          -2.44
P(T<=t) one-tail                0.05
t Critical one-tail             2.35
P(T<=t) two-tail                0.09
t Critical two-tail             3.18

The second test that we are going to use is the F-test to test if there are any significant differences in the risks between the two stocks measured by the variance.

We are testing the following hypotheses:

H0: s1² = s2²
H1: s1² ≠ s2²

F-Test Two-Sample for Variances

                       Stock 1, variance    Stock 2, variance
Mean                   0.525                0.175
Variance               0.004                0.004
Observations           4                    4
df                     3                    3
F                      1
P(F<=f) one-tail       0.5
F Critical one-tail    0.11

Hint: Please comment on whether there is a significant difference, based on the previous examples. Compare the t- and F-statistics with the t and F critical values and accordingly reject or do not reject the null hypothesis.

Chi-square test

Description of the chi-square test

The hypothesis tests we have looked at so far are all parametric tests. This means that they make an assumption about the distribution of the population we are sampling from. In addition, such hypothesis testing concentrates on the mean of the population. On the other hand, non-parametric tests such as the chi-square test do not make any such assumptions. As the name may suggest, the statistic calculated involves squaring values, and thus the result can only be positive.

A non-parametric test is still a hypothesis test, but rather than considering just a single parameter of the sample data, it looks at the overall distribution and compares it with some known or expected values, usually based upon the null hypothesis.

We shall look at two particular applications of the chi-square (χ²) test. The first considers whether a particular set of data follows a known statistical distribution; this type of test is known as a goodness-of-fit test. Secondly, we will use a chi-square test to consider survey data, usually from questionnaires. From the responses to the questions, it is possible to construct tables which show how the responses to one question relate to the responses to another. This type of test is known as a test of independence or association.


The concept of degrees of freedom (v)

As we saw, for large samples (n ≥ 30) we use the normal distribution. The shape of the distribution actually changes as the sample size changes, so we must apply different probabilities to samples of different sizes. The concept of degrees of freedom captures this idea. The number of degrees of freedom is in general expressed as n - 1, where n is the sample size. The critical value (the number of standard errors from the mean) varies depending on the degrees of freedom.

Goodness-of-fit tests

The χ²-statistic may be used as a test of goodness-of-fit in comparing an observed distribution with an expected distribution.

In effect, this is a method for deciding whether a set of data follows a particular distribution. The test requires some observed values (O) and the ability to calculate some expected values which would apply if the data did follow the assumed distribution (E). In this application, the test statistic is given by:

χ² = Σ (O - E)² / E

Where:

O = observed values

E = expected values

Σ = summation over all the cells in the table


The degrees of freedom which determine the critical values are given by (k - 1) in a goodness-of-fit test, where k is the number of observed and expected values.

The method may be summarised as follows:

1. Formulate the null and alternative hypotheses.

2. Select a significance level (say, 5 per cent), determine the degrees of freedom (k - 1) and look up the critical value from the χ² table at the end of your handout.

3. Calculate the expected values on the assumption that the null hypothesis is true.

4. Calculate the test statistic and compare with the critical value. If the calculated value exceeds the critical value, reject the null hypothesis. Otherwise, do not reject the null hypothesis.

Exercise

Use a chi-square test at the 5% significance level to test whether the following test scores are normally distributed.

Test score (out of 10)    Frequency
1                         1
2                         2
3                         5
4                         6
5                         8
6                         10
7                         7
8                         20
9                         15
10                        20

Total                     94

1. State the hypotheses:

H0:
H1:

2. The significance level will be taken as 5%. The degrees of freedom will be ……….

3.

4.

χ² = Σ (O - E)² / E

O    E    (O - E)    (O - E)²    (O - E)² / E

χ² =

Solution

1. State the hypotheses:

H0: The test scores distribution is normally distributed.
H1: The test scores distribution is not normally distributed.

2. The significance level will be taken as 5% (two-sided test). The degrees of freedom are k - 1 = 10 - 1 = 9. Thus, the critical value is 16.92.

3. The expected frequency is the total divided by the number of test scores. Thus, we have 94 / 10 = 9.4.

4. Note that the observed values O are the frequencies from the table, not the score values themselves.

χ² = Σ (O - E)² / E

O     E      (O - E)    (O - E)²    (O - E)² / E
1     9.4    -8.4       70.56       7.51
2     9.4    -7.4       54.76       5.83
5     9.4    -4.4       19.36       2.06
6     9.4    -3.4       11.56       1.23
8     9.4    -1.4       1.96        0.21
10    9.4    0.6        0.36        0.04
7     9.4    -2.4       5.76        0.61
20    9.4    10.6       112.36      11.95
15    9.4    5.6        31.36       3.34
20    9.4    10.6       112.36      11.95

Total χ² = 44.72

44.72 > 16.92

Therefore, we reject H0. So, there is evidence that the distribution of the test scores is not normally distributed.
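The goodness-of-fit calculation can be verified with a Python sketch; the observed values are the frequencies from the table:

```python
observed = [1, 2, 5, 6, 8, 10, 7, 20, 15, 20]   # frequencies of the ten test scores
# Equal expected frequency in each cell: 94 / 10 = 9.4.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
reject_h0 = chi2 > 16.92    # critical value for 9 df at the 5% significance level
```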

Tests of independence

You will need to have collected frequency data from two situations. The test involves setting up two hypotheses. The null hypothesis (H0) states that the two variables are independent of one another, and the alternative hypothesis (H1) states that the two variables are associated with one another. The null hypothesis is always stated first. The χ² test allows you to find whether there are any statistically significant differences between the actual (observed) frequencies and the hypothesised (expected) frequencies.

Tests of independence are common in analysing questionnaire results. From the responses to the questions, it is possible to construct tables which show how the responses to one question relate to the responses to another. These tables are called cross-tabulations or contingency tables.

The method is as follows:

1. Formulate the null and alternative hypotheses. The null hypothesis is that there is no association between the two sets of responses (i.e. they are independent).

2. Select a significance level (say, 5 per cent).


3. Determine the degrees of freedom (r - 1)(c - 1) where r is the number of rows in the contingency table and c is the number of columns. Look up the critical value.

4. Calculate the expected values on the assumption that the null hypothesis is true.

[To calculate the expected values for each cell in the table, calculate the ‘row total’, divided by the ‘grand total’, multiplied by the ‘column total’.]

5. Calculate the test statistic (same formula as in the goodness-of-fit test) and compare with the critical value. If the calculated value exceeds the critical value, reject the null hypothesis. Otherwise, do not reject the null hypothesis.

Exercise

Let's consider the annual income in Pounds of stockbrokers working in two different companies.

                             25,000    40,000    Totals
Stockbrokers in company A    100       80        180
Stockbrokers in company B    150       90        240
Totals                       250       170       420

Determine whether there is any relationship between the annual income of stockbrokers working in two different companies. Use the 5% level of significance.

1. H0:

H1:

2. The significance level is 5%

3. The contingency table has two rows and two columns. The degrees of freedom are therefore ……….

4. Contingency table and calculation of expected frequencies (E).


The expected frequency is calculated as:

(Row total x Column total) / Table total

5. A chi-square statistic χ² is calculated by taking the difference between each observed result and the corresponding expected result, squaring the difference, and dividing this square by the expected result. All the figures found are then added up.

χ² = Σ (O - E)² / E

Solution

1. H0: There is no association between the annual incomes of stockbrokers working in the two companies; in other words, the annual incomes are independent of the company.

H1: There is such an association, i.e. such dependence.

2. The significance level is 5%.

3. The contingency table has two rows and two columns. The degrees of freedom are therefore (2 - 1)(2 - 1) = 1 x 1 = 1.

4. Contingency table and calculation of expected frequencies (E). The expected frequency is calculated as (row total x column total) / table total.


                             25,000      40,000
Stockbrokers in company A    107.1429    72.8571
Stockbrokers in company B    142.8571    97.1429

5. The chi-square statistic χ² is calculated by taking the difference between each observed result and the corresponding expected result, squaring the difference, and dividing this square by the expected result. All the figures found are then added up.

χ² = Σ (O - E)² / E

                                      O      E           (O - E)    (O - E)²    (O - E)² / E
Stockbrokers in company A (25,000)    100    107.1429    -7.1429    51.0210     0.4762
Stockbrokers in company B (25,000)    150    142.8571    7.1429     51.0210     0.3571
Stockbrokers in company A (40,000)    80     72.8571     7.1429     51.0210     0.7003
Stockbrokers in company B (40,000)    90     97.1429     -7.1429    51.0210     0.5252

Total χ² = 2.06

From the χ² table, the critical value at the 5% level of significance (95% confidence level) with 1 degree of freedom is 3.84.

The value of χ², 2.06, is below the critical value: 2.06 < 3.84. We therefore cannot reject the null hypothesis that there is no association between the annual incomes of stockbrokers working in the two companies.
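The expected frequencies and the χ² statistic for the contingency table can be reproduced in Python:

```python
observed = [
    [100, 80],   # company A: 25,000 and 40,000 income groups
    [150, 90],   # company B
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell: row total x column total / grand total.
expected = [
    [row_totals[i] * col_totals[j] / grand_total for j in range(2)]
    for i in range(2)
]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)
reject_h0 = chi2 > 3.84   # critical value for 1 df at the 5% level
```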


Exercise

A statistician was assigned the task of checking whether the variability of sales of two computer businesses at different locations is the same. He has gathered two random samples with 10 observations each; thus, n1 = n2 = 10. The relevant data set is attached.

Shop in first location (sales in Euro):  190, 200, 205, 210, 215, 220, 225, 230, 235, 240

Shop in second location (sales in Euro): 290, 300, 305, 310, 315, 320, 325, 330, 335, 340

It is required to check if the variability in sales is different in the two locations. In addition, it is given that the market variability of the computer business, measured by the standard deviation, is 265 Euro. Variability greater than 265 Euro signifies that the business is badly managed. The expected value is 262. Please test, based on your data set, if there is such evidence at the 5% significance level. Please state which tests you are going to use and the relevant hypotheses.


Solution

First of all, use the F-test to check equality of variances.

We are testing the following hypotheses:

H0: σ1² = σ2²

H1: σ1² ≠ σ2²

F-Test Two-Sample for Variances

                        Shop in first location   Shop in second location
Mean                    217                      317
Variance                256.67                   256.67
Observations            10                       10
df                      9                        9
F                       1
P(F<=f) one-tail        0.5
F Critical one-tail     0.31

Hint: Please comment on whether there is a significant difference, based on the previous examples. Compare the F and χ² statistics with the F and χ² critical regions. Accordingly, reject or accept the null hypothesis.

To answer the second part of the question, it is required to use the chi-square statistic to test if the standard deviation σ = 265 Euro. The observed value is 265 Euro and the expected value is 262.

χ² = Σ (O − E)² / E

χ² = (265 − 262)² / 262 = 0.034 (to 3 d.p.)

The degrees of freedom are k-1 or 10 – 1 = 9 and the critical region at the 5% significance level is 16.92.

The hypotheses are as follows:

H0: σ = 265 Euro
H1: σ ≠ 265 Euro

0.034 < 16.92


Therefore, we cannot reject H0. There is not enough evidence to accept the alternative hypothesis.
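A quick way to reproduce both test statistics is a short Python sketch (illustrative; the variances use the sample formula, as in the Excel output):

```python
shop1 = [190, 200, 205, 210, 215, 220, 225, 230, 235, 240]
shop2 = [290, 300, 305, 310, 315, 320, 325, 330, 335, 340]

def sample_variance(xs):
    # Sample variance with the (n - 1) divisor
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

var1 = sample_variance(shop1)   # 256.67
var2 = sample_variance(shop2)   # 256.67
F = var1 / var2                 # 1, so the variances are equal

# Chi-square statistic for the second part of the question
chi2 = (265 - 262) ** 2 / 262   # 0.034, below the 16.92 critical value
```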

Define what is meant by a continuous variable and give some examples

A continuous variable is one that can assume any value within a given interval. For example, if we say that a production process takes 2 to 3 hours, this means anywhere between 2 and 3 hours. Time is thus a continuous variable, and so are weight, distance, and temperature.

Define what is meant by a continuous probability distribution

A continuous probability distribution refers to the range of all possible values that a continuous random variable can assume, together with the associated probabilities. The probability distribution of a continuous random variable is often called a probability density function or probability function. It is given by a smooth curve such that the total area (probability) under the curve is 1.

What is a normal distribution and what is its usefulness?

A very important continuous probability distribution is the normal distribution. It has a bell shape and is symmetrical about the mean, median, and mode.

The normal distribution is the most commonly used of all probability distributions in statistical analysis.

Characteristics of the normal distribution

This distribution deals with continuous data


It is symmetrical; therefore 50% (0.5) of outcomes have a value greater than the mean value, and 50% (0.5) of outcomes have a value less than the mean value. The area under the curve totals exactly 1, or 100%.

Under the normal probability distribution, areas under the curve are located between plus and minus any given number of standard deviations from the mean.

The central point is the mean, median and mode. It has a bell shape. It can be summarised by its mean and standard deviation; thus, knowledge of μ and σ enables us to draw the exact normal curve associated with a random variable.

Areas under the curve represent probabilities.

Sketch a normal curve

What is the standard normal curve and what is its usefulness?

The standard normal curve is a normal distribution with μ = 0 and a standard deviation of one. It is used to solve the main problem of how to compute a probability, which is measured by the area under the normal curve. This can be done using the tabulated values given in statistical tables.

Every normal distribution can be converted to the standard normal distribution by means of a simple transformation, which changes the original x values into corresponding standard normal z values; these z values are then looked up in the standard normal distribution table.

The formula of the z value is:

z = (x − μ) / σ

Where: z is the number of standard deviations above or below the mean, x is the value of the variable under consideration, μ is the mean of the population and σ (sigma) is the standard deviation.


To find the area under any normal curve, first perform this transformation, and then use the table to find the equivalent area under the standard normal curve.

Note that z represents the number of standard deviations an x value is above or below its mean.

Find areas under the standard normal curve

Example

Find the area between z = -1 and z =1 and sketch a graph to illustrate the normal distribution x scale and the conversion into a standard normal distribution z scale.

Solution

The area (probability) under the standard normal curve for z = 1 is obtained by looking up the value of 1.0 in the standard normal table. This is accomplished by moving down the z column in the table to 1.0 and then across until we are below the column headed .00. The value that we get is 0.1587. This means that 15.87% of the total area of 100% (or 1) under the curve lies to the right of z = 1.

Because of symmetry, the area to the left of z = −1 is also 0.1587, or 15.87%.

Therefore, the area between z = −1 and z = 1 is 1 − 0.3174 = 0.6826.


[Sketch: the normal curve on the x scale from −∞ to +∞, converted to the standard normal curve on the z scale; 15.87% of the area lies in each tail beyond z = −1 and z = +1, and 68.26% lies between them.]

Exercises

(1) Find the area under the standard normal curve:
(a) to the left of z = −1.7
(b) to the right of z = 2.85
(c) to the left of z = −0.3
(d) between z = 1.55 and z = 2.15

Answers:
(a) 0.0446
(b) 0.00219
(c) 0.3821
(d) (0.0606 − 0.01578) = 0.04482

2) Suppose that x is a normally distributed random variable with μ = 10 and σ = 2, and we want to find the probability of x assuming a value between 8 and 12. We first calculate the z values corresponding to the x values of 8 and 12 and then look up these z values.

z1 = (x1 − μ) / σ = (8 − 10) / 2 = −1

z2 = (x2 − μ) / σ = (12 − 10) / 2 = 1


So, z = 1 corresponds to 0.1587 for the right-hand tail of the normal distribution. Therefore, the area between z = −1 and z = 1 is 1 − 0.3174 = 0.6826. The value 0.3174 is calculated as 0.1587 x 2 = 0.3174, because we account for both tails of the distribution. This means that the probability of x assuming a value between 8 and 12, or P(8 ≤ x ≤ 12), is 68.26%.

3) Suppose that x is a normally distributed random variable with μ = 15 and σ = 10. We want to find the probability of x assuming a value between 10 and 14. We first calculate the z values corresponding to the x values of 10 and 14 and then look up these z values.

z1 = (x1 − μ) / σ = (10 − 15) / 10 = −0.5

z2 = (x2 − μ) / σ = (14 − 15) / 10 = −0.1

Thus, we want the area (probability) between z1 = −0.5 and z2 = −0.1. From the table, the area to the left of z1 = −0.5 is 0.3085 and the area to the left of z2 = −0.1 is 0.4602. We deduct 0.4602 − 0.3085 = 0.1517, or 15.17%. Thus, P(10 ≤ x ≤ 14) is 15.17%.
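Both probabilities can be checked in Python with the standard normal CDF built from math.erf (an illustrative sketch, not part of the original notes):

```python
from math import erf, sqrt

def phi(z):
    # Standard normal cumulative distribution function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    # P(a <= x <= b) for x ~ N(mu, sigma^2)
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

p1 = prob_between(8, 12, 10, 2)     # ~0.6827, example 2 above
p2 = prob_between(10, 14, 15, 10)   # ~0.1516, example 3 above
```

The small differences from 0.6826 and 0.1517 come from the four-decimal rounding of the printed tables.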

I will add an example on how to perform a z – test of two sample means.

The dataset is as follows:

x1 40 20 10

x2 30 40 50


The variance for x1 is 233.33 and for x2 is 100.

z-Test: Two Sample for Means

                               x1             x2
Mean                           23.33333333    40
Known Variance                 233.33         100
Observations                   3              3
Hypothesized Mean Difference   0
z                              -1.581146736
P(Z<=z) one-tail               0.056922249
z Critical one-tail            1.644853476
P(Z<=z) two-tail               0.113844498
z Critical two-tail            1.959962787

The statistics that we are interested in are the mean, the variance, the number of observations, the z statistic and the z critical one- and two-tail values.

Then, you compare the z statistic with the z critical one- and two-tail values. In our case, the absolute value of the z statistic is below both critical values: |−1.58| < 1.64 for the one-tail test and |−1.58| < 1.96 for the two-tail test. Therefore, the sample evidence suggests that we cannot reject H0 of equality of means.
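The z statistic in the Excel output can be reproduced directly from its formula (illustrative sketch, with the known variances treated as given):

```python
from math import sqrt

x1 = [40, 20, 10]
x2 = [30, 40, 50]
n1, n2 = len(x1), len(x2)
mean1, mean2 = sum(x1) / n1, sum(x2) / n2          # 23.33 and 40
var1, var2 = 233.33, 100.0                          # known variances from the output

# Two-sample z statistic with hypothesized mean difference of zero
z = (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)   # ~ -1.58
```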

Please consider another example related to share and market returns. Please perform a z-test of two sample means. The variance of the share returns is 3.97 and the variance of the market returns is 10.48. The variances in Excel are calculated by using the function =VAR(). Inside the brackets you insert the range of your data. For example, =VAR(A2:A31) for the first variable, share returns, and =VAR(B2:B31) for the second variable, market returns.

      A               B
1     Share returns   Market returns
2     3.526787        8.73209
3     -4.34533        -5.19815
4     5.222709        6.21865
5     -4.99619        -5.5393
6     -3.04336        7.69808
7     -2.375422       -4.99735
8     2.651303        5.42777
9     -0.68924        -1.5424
10    0.205664        1.4639
11    2.4783          3.6528
12    0.237407        -0.1494
13    0.329728        0.16688
14    -0.26869        -0.1444
15    0.064769        0.097873
16    -0.5873         -0.09911
17    0.329225        -0.08344
18    -0.11849        0.122767
19    0.011541        -0.45767
20    -0.18757        -0.53046
21    -0.38752        -0.11118
22    -0.26835        -0.28947
23    0.262798        -0.17676
24    0.355054        -1.15686
25    -1.34302        -0.5771
26    -0.77964        0.578182
27    -0.04649        -0.05331
28    0.098381        -0.23054
29    -0.09585        -0.66625
30    -0.0059         -0.50071
31    -0.05415        -0.53128

The output of the z-test: two sample for means will be as follows:

z-Test: Two Sample for Means

                               Share returns   Market returns
Mean                           -0.12729        0.370795
Known Variance                 3.97            10.48
Observations                   30              30
Hypothesized Mean Difference   0
z                              -0.71769
P(Z<=z) one-tail               0.236475
z Critical one-tail            1.644853
P(Z<=z) two-tail               0.472951
z Critical two-tail            1.959963

The statistics that we are interested in are the mean, the variance, the number of observations, the z – statistics and the z critical one and two tail.

Then, you compare the z statistic with the z critical one- and two-tail values. In our case, |−0.72| < 1.64 for the one-tail test and |−0.72| < 1.96 for the two-tail test. Therefore, the sample evidence suggests that we cannot reject H0 of equality of means.


Chebyshev’s theorem

Chebyshev's theorem gives a lower bound on the proportion of observations, in a sample or population of any shape, that lie within k standard deviations of the mean, for any k greater than one. The mathematical formula is as follows:

1 − (1/k)²

Where k is the number of standard deviations from the mean of the random variable X. Let's take as an example a right-skewed distribution. It is required to find the proportion of observations within ±2 standard deviations. By using Chebyshev's theorem, we can find the proportion of observations within two standard deviations from the mean.

By applying the above formula we have the following results:

1 − (1/2)² = 1 − 1/4 = 1 − 0.25 = 0.75

Therefore, at least 75% of the observations lie between −2 and +2 standard deviations from the mean.


Let's take as an example a right-skewed distribution with a mean of 3.24 and a standard deviation of 4.27. Please calculate the proportion of observations within ±4.27 standard deviations from the mean.

By applying the above formula we have the following results:

1 − (1/4.27)² = 1 − 1/18.2329 = 1 − 0.0548 = 0.9452

Therefore, at least 94.52% of the observations lie within ±4.27 standard deviations from the mean.
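The bound is easy to wrap in a small function (an illustrative sketch):

```python
def chebyshev_bound(k):
    # Minimum proportion of observations within k standard deviations
    # of the mean; valid for any distribution when k > 1.
    return 1 - (1 / k) ** 2

chebyshev_bound(2)     # 0.75
chebyshev_bound(4.27)  # ~0.9452
```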

One - way analysis of variance

Let's take as an example two variables x1 and x2 and their three corresponding numerical values. Test at the 5% significance level whether the two variables have the same mean.

x1: 40 20 10
x2: 30 40 50

First of all we state the hypotheses as follows:

H0: The means of the two variables are equal.
H1: The means of the two variables are not equal.

Then, we calculate the sample mean for each variable.

x̄1 = (40 + 20 + 10) / 3 = 70 / 3 = 23.33

x̄2 = (30 + 40 + 50) / 3 = 120 / 3 = 40

Then, we calculate the sample mean for both variables.


x̄ of both variables = (23.33 + 40) / 2 = 63.33 / 2 = 31.665

The total square variation = (40 − 31.665)² + (20 − 31.665)² + (10 − 31.665)² + (30 − 31.665)² + (40 − 31.665)² + (50 − 31.665)²

The total square variation = 69.472225 + 136.072225 + 469.372225 + 2.772225 + 69.472225 + 336.172225 = 1083.33335

The square deviation between rows is as follows:

Square deviation between rows = 3*[(23.33 − 31.665)² + (40 − 31.665)²] = 3*(69.472225 + 69.472225) = 416.83335

The square deviation due to random error = 1083.33335 − 416.83335 = 666.5

The F test statistic = (416.83335 / 1) / (666.5 / 4) = 416.83335 / 166.625 = 2.50

The F statistic is 2.50 and the critical value for 1 degree of freedom in the numerator and 4 degrees of freedom in the denominator is 7.71. The degrees of freedom were calculated as follows:

Numerator = rows − 1 = 2 − 1 = 1
Denominator = rows x columns − rows = 2 x 3 − 2 = 4

As 2.50 < 7.71, we cannot reject the null hypothesis.
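The decomposition can be computed step by step in Python (an illustrative sketch of the sums-of-squares arithmetic):

```python
groups = [[40, 20, 10], [30, 40, 50]]
all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# Between-groups and within-groups (error) sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1              # 1
df_error = len(all_obs) - len(groups)     # 4

F = (ss_between / df_between) / (ss_error / df_error)   # 2.5 < 7.71
```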


Introduction to matrix algebra

Matrix definition

A matrix is a rectangular array of numbers arranged in rows and columns and is characterized by its size, which is given by the number of rows and the number of columns. The whole matrix is usually referred to by a capital letter.

The sets of numbers used in algebra

They are subsets of R, the set of real numbers.

Natural numbers N

The counting numbers, e.g., 1, 2, 3, 4, ...

Integer

A member of the set of positive whole numbers (1, 2, 3, ...), negative whole numbers (−1, −2, −3, ...), or zero. It does not include fractional or decimal numbers. The set is Z = (..., −3, −2, −1, 0, 1, 2, 3, ...).

Real number


All numbers of the set R, which includes the rational and the irrational numbers. A real number may be described informally as a number that can be given by an infinite decimal representation, such as 2.4871773339... The real numbers include both rational numbers, such as 42 and −23/129, and irrational numbers, such as π.

Rational numbers

The set of all numbers that can be written as quotients a/b, b ≠ 0, with a and b integers, e.g., 3/19, 10/5, −7.13, ...

Irrational numbers

All real numbers that are not rational numbers, e.g., π, √2. Irrational numbers are real numbers that cannot be expressed as a ratio of an integer p to a natural number q. A natural number is a whole non-negative number.

Imaginary number

A quantity of the form ix, where x is a real number and i is the positive square root of -1.  The term imaginary probably originated from the fact that there is no real number z that satisfies the equation z2 = -1. 

Complex number

A quantity of the form v + iw, where v and w are real numbers, and i represents the imaginary unit, equal to the positive square root of −1. The set C of all complex numbers corresponds to the set of all ordered pairs of real numbers.

Example

A = [ 1  8
      3  7 ]

Matrix A in terms of order is a 2 x 2 matrix, and as the number of columns equals the number of rows, it is known as a square matrix.

Vectors definition

If a matrix has only one row, then it is known as a row vector. If it has only one column, then it is a column vector.

Example


B = [ 5  2  8 ]

The order of matrix B is ( ) and it is known as ...

C = [ 10
      8
      5 ]

The order of matrix C is ( ) and it is known as ...

Transpose is a function that converts the rows of the matrix to columns and the columns to rows.

For example, the following matrix

A = [ 4  6  8
      3  5  … ]

A′ = [ 4  3
       6  5
       8  … ]

Add and subtract matrices

To add or subtract matrices, they must be of the same order or size. In other words, they must have the same number of columns and the same number of rows. When this condition holds, in addition the corresponding elements in each matrix are added together, and in subtraction the elements of the second matrix are subtracted from the corresponding elements of the first.

Example of Addition

A = [ 1  11
      6   2 ]

B = [ 2  0
      5  9 ]

A + B = [ 1+2  11+0
          6+5   2+9 ]

A + B = [  3  11
          11  11 ]


Exercises

Add the following matrices

1) A = [ 10  15
         20  14 ]

   B = [ 21  13
         12  17 ]

2) A = [ 1  4
         8  2 ]

   B = [ 6
         5 ]

Example of subtraction

A = [ 1  11
      6   2 ]

B = [ 2  0
      5  9 ]

A − B = [ 1−2  11−0
          6−5   2−9 ]

A − B = [ −1  11
           1  −7 ]

Exercise

Subtract the following matrices

1) A = [ 10  15
         20  14 ]

   B = [ 21  13
         12  17 ]

Multiply matrices

There are two aspects of matrix multiplication: the multiplication of a matrix by a single number, called a scalar, and the multiplication of a matrix by another matrix. To multiply two matrices, you should first check that it is feasible. This condition is satisfied if the number of columns in the first matrix equals the number of rows in the second matrix. If matrix A is of order (a x b) and matrix B is of order (c x d), then for multiplication to be possible, b must equal c, and the new matrix produced by the product AB will be of order (a x d).

For example, please calculate the result of the following two matrices.

A = [ 2  5  8  9 ]

B = [ 10
       6
       4
       3 ]


A*B = 2*10 + 5*6 + 8*4 + 9*3 = 20 + 30 + 32 + 27 = 109

Please consider another example of the following two matrices.

A = [ 2  5  8
      …  …  … ]

This is a 2 x 3 matrix.

B = [  5   6
       7  10
      11   4 ]

This is a 3 x 2 matrix. The number of columns of A equals the number of rows of B, so after the multiplication we are going to have a 2 x 2 matrix. The first row of A gives the first row of the product:

A*B = [ 2*5 + 5*7 + 8*11   2*6 + 5*10 + 8*4     [ 133  94
        …                  …                ] =    …    … ]

Exercise

A = [ 3  1
      2  4
      7  4 ]

This is a 3 x 2 matrix

B = [ 8  0   5  4
      3  2  11  1 ]

This is a 2 x 4 matrix.

Required

Calculate A x B

First of all, please check if the calculation is feasible. This is a (3 x 2) matrix multiplied by a (2 x 4) matrix, so multiplication is possible and the product will be a (3 x 4) matrix.


If it is feasible, multiply the 1st element in the 1st row of A by the 1st element in the 1st column of B (i.e. 3 x 8).

Multiply the 2nd element in the 1st row of A by the 2nd element in the 1st column of B (i.e. 1 x 3).

This multiplication process is continued until the nth element in the first row of the first matrix has been multiplied by the nth element in the first column of the second matrix.

All these products are added to give the element in the first row and first column of the new matrix AB. Example: (3 x 8) + (1 x 3) = 27.
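The full product can be verified with a small, illustrative matrix-multiplication routine; the rows of A are read as (3, 1), (2, 4), (7, 4), consistent with the (3 x 8) + (1 x 3) = 27 step above:

```python
A = [[3, 1],
     [2, 4],
     [7, 4]]           # 3 x 2
B = [[8, 0, 5, 4],
     [3, 2, 11, 1]]    # 2 x 4

def matmul(X, Y):
    # Feasibility check: columns of X must equal rows of Y
    assert len(X[0]) == len(Y)
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

AB = matmul(A, B)      # a 3 x 4 matrix; AB[0][0] = (3 x 8) + (1 x 3) = 27
```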

Example of a square matrix

A square matrix with the same number of rows and columns is multiplied by itself.

A = [ 2  5
      3  4 ]

A² = [ 2  5     [ 2  5
      3  4 ] *   3  4 ]

A² = [ 2*2 + 5*3   2*5 + 5*4     [ 19  30
      3*2 + 4*3   3*5 + 4*4 ] =   18  31 ]

Multiplication of a vector with a matrix

Multiply a vector with a 3 x 3 matrix.

[ 2  −3  5 ] * [ 2  4  −2
                 0  3   3
                 …  …   … ]

Scalar Multiplication

A scalar is an ordinary number such as 2, 3, 10, etc. The rule for this is simple: multiply each element in the matrix by the scalar. For example:


Let A = [ 5  2
          8  3 ]

The scalar is 2. It is required to find the result of multiplying the scalar by the matrix.

2*A = [ 2*5  2*2     [ 10  4
        2*8  2*3 ] =   16  6 ]

Exercise

Let A = [ 7   4
          10  2 ]

And it is required to find 4 x A. The scalar is 4.

[Complete the calculation]

Unity matrix

In matrix algebra the unity matrix is any square matrix whose top-left to bottom-right diagonal consists of 1s, while all the rest of the matrix consists of zeros. This matrix is given by the symbol I, thus:

I = [ 1  0
      0  1 ]

Matrices are only equal where they are the same size and have the same elements.

As with ordinary numbers, where a number multiplied by one equals itself (3 x 1 = 3), so with matrices: a matrix B multiplied by the unity matrix equals itself.

BI = B


B = [ 1  6
      2  3 ]

Prove that BI=B

Determinants

The determinant is a scalar. For a 2 x 2 matrix it is found by multiplying the two elements of the principal diagonal and subtracting from this product the product of the elements of the other diagonal.

If A = [ a  b
         c  d ]

The determinant |A| = ad − cb

For example, please calculate the determinant |A| for the following matrices.

A = [ 8  10
      …   … ]

B = [ 20  −7
       …   … ]

Another example is to find the determinant of more complicated matrices

B = [ 4  3  2
      5  7  8
      2  3  6 ]

OR

B = [ a11  a12  a13
      b21  b22  b23
      c31  c32  c33 ]

The determinant |B| = a11 * | b22  b23 |  −  a12 * | b21  b23 |  +  a13 * | b21  b22 |
                            | c32  c33 |           | c31  c33 |           | c31  c32 |

|B| = 4[(7 x 6) − (3 x 8)] − 3[(5 x 6) − (2 x 8)] + 2[(5 x 3) − (7 x 2)]
|B| = 4(18) − 3(14) + 2(1) = 72 − 42 + 2 = 32

Then, transpose the B matrix and find the determinant.

B′ = [ 4  5  2
       3  7  3
       2  8  6 ]

|B′| = 4[(7 x 6) − (3 x 8)] − 5[(3 x 6) − (2 x 3)] + 2[(8 x 3) − (7 x 2)]
|B′| = 4(18) − 5(12) + 2(10) = 72 − 60 + 20 = 32

The conclusion is that the determinant of a matrix equals the determinant of its transpose.
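This property is easy to verify numerically with cofactor expansion (an illustrative sketch; the 3 x 3 matrix is the one implied by the worked expansion above):

```python
def det2(m):
    (a, b), (c, d) = m
    return a * d - c * b

def det3(m):
    # Cofactor expansion along the first row
    return (m[0][0] * det2([[m[1][1], m[1][2]], [m[2][1], m[2][2]]])
          - m[0][1] * det2([[m[1][0], m[1][2]], [m[2][0], m[2][2]]])
          + m[0][2] * det2([[m[1][0], m[1][1]], [m[2][0], m[2][1]]]))

B = [[4, 3, 2], [5, 7, 8], [2, 3, 6]]
B_t = [list(row) for row in zip(*B)]   # transpose of B

det3(B), det3(B_t)                     # both equal 32
```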

Exercises

Find the determinant of the following C and D matrices.

C = [ 2  6  8
      5  2  3
      …  …  … ]

OR

C = [ a11  a12  a13
      b21  b22  b23
      c31  c32  c33 ]

The determinant |C| = a11 * | b22  b23 |  −  a12 * | b21  b23 |  +  a13 * | b21  b22 |
                            | c32  c33 |           | c31  c33 |           | c31  c32 |


Please complete the calculations ………………………

Then, transpose the C matrix and find the determinant.

C '=

Please transpose the matrix ………………

|C′| = a11 * | b22  b23 |  −  a12 * | b21  b23 |  +  a13 * | b21  b22 |
             | c32  c33 |           | c31  c33 |           | c31  c32 |

Please find the determinant of the transposed matrix. ………………………….

The conclusion that you should get is that the determinant of a matrix equals the determinant of its transpose.


Exercises

Please find the determinant of the following matrices

A = [ 3  15
      …   … ]

B = [ 2  −4
      …   … ]

C = [ 8  −3
      …   … ]

D = [ −2  25
       …   … ]

Invert square matrices

In matrix algebra the function of division is replaced by inversion. The inverse (or reciprocal) of a matrix has the same property as the inverse of an ordinary number. The inverse of 8 is 1/8, so that 8 x 1/8 = 1.

In matrix algebra the inverse of a matrix is denoted by A-1

A x A-1 = I


This means that if A is multiplied by its inverse or vice versa the product is a unity matrix.

The inverse of a 2 x 2 matrix can be found as follows.

If A = [ a  b
         c  d ]

Then, A⁻¹ = 1/(ad − bc) * [  d  −b
                            −c   a ]

Assume that it is required to find the inverse of matrix A

A = [ 1  2
      3  4 ]

A⁻¹ = 1/((1 x 4) − (2 x 3)) * [  4  −2
                                −3   1 ]

A⁻¹ = 1/(−2) * [  4  −2
                 −3   1 ]

A⁻¹ = [ −2     1
        1.5  −0.5 ]

Idempotent matrix

A matrix A is defined to be idempotent if A² = A

For example, we consider the following matrix.

A = [  4  −1
      12  −3 ]

The next step is to multiply the matrix by itself.


A² = [ 4*4 + (−1)*12    4*(−1) + (−1)*(−3)     [  4  −1
       12*4 + (−3)*12   12*(−1) + (−3)*(−3) ] =  12  −3 ]

We have proved that A is an idempotent matrix, as A² = A.
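The check A² = A can be automated; here is an illustrative sketch using a 2 x 2 idempotent matrix whose first row is 4, −1:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[4, -1],
     [12, -3]]
A_squared = matmul(A, A)   # equals A, so A is idempotent
```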

As another example, please consider the following B matrix: Is it idempotent?

B = [ 1  0
      …  … ]

B² = [ 1  0
      …  … ]

The matrix B is idempotent.

Exercises

Please check whether the following matrices are idempotent.

C = [ 2  −3
      …   … ]

D = [ 4  −1
      …   … ]

Trace of a matrix

The trace of a matrix A is defined as the sum of its diagonal elements.

For example, please consider the following matrix.

A = [ 3  4  8
      2  5  6
      …  …  4 ]

The trace(A) = 3 + 5 + 4 = 12.

Please consider the following two matrices A and B.

A = [ 3  4  8
      2  5  6
      …  …  4 ]

B = [ 2  3  6
      8  5  7
      …  …  1 ]


Trace (A+B) = trace(A) + trace (B)

Trace (A+B) = (3 + 5 + 4) + ( 2 + 5 + 1)

Trace (A+B) = 12 + 8 = 20.
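A short illustrative check of the trace property; the diagonal entries match the example, while the hidden off-diagonal entries are made up for the sketch:

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))

A = [[3, 4, 8], [2, 5, 6], [1, 1, 4]]   # diagonal 3, 5, 4
B = [[2, 3, 6], [8, 5, 7], [4, 2, 1]]   # diagonal 2, 5, 1
A_plus_B = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

trace(A), trace(B), trace(A_plus_B)     # 12, 8 and 20
```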

Exercises

Please calculate the trace of the following matrices. Trace (B+C)

B = [ 2  3  9
      5  4  8
      …  …  … ]

C = [ 1  4  2
      2  5  8
      …  …  … ]

Please calculate the function 2* trace(A)

AND

4* trace (B)

Kronecker product

Let A be a b x c matrix and B be a d x e matrix. Then A⊗B is the (bd) x (ce) block matrix:

A⊗B = [ a11*B  a12*B  ...  a1c*B
        ...    ...    ...  ...
        ab1*B  ab2*B  ...  abc*B ]

Cramer’s rule

It is used to solve systems of equations that are not homogeneous by using determinants, as covered earlier in this section. It is an easy way to find the solutions of, for example, a 3 x 3 system. The mathematical formulas for the individual solutions are as follows:

x = (determinant of x) / determinant


y = (determinant of y) / determinant

z = (determinant of z) / determinant

Consider the following system of equations:

2x + 3y + 2z = 6

x + 2y + 4z = 5

3x + 6y + 7z = 8

It is required to find the solutions for x, y and z.

The first thing is to write our equations in a matrix format and find the determinant of the 3 x 3 coefficient matrix.

  x  y  z
[ 2  3  2
  1  2  4
  3  6  7 ]

D = 2 * | 2  4 |  −  3 * | 1  4 |  +  2 * | 1  2 |
        | 6  7 |         | 3  7 |         | 3  6 |

D = 2*(14 − 24) − 3*(7 − 12) + 2*(6 − 6) = −20 + 15 + 0 = −5

Then, we calculate the determinants of x, y and z by substituting the column of constants

[ 6
  5
  8 ]

in place of the column of coefficients of x, y and z respectively.


Determinant of x = Dx = | 6  3  2
                          5  2  4
                          8  6  7 |

Dx = 6*(14 − 24) − 3*(35 − 32) + 2*(30 − 16) = −60 − 9 + 28 = −41

Determinant of y = Dy = | 2  6  2
                          1  5  4
                          3  8  7 |

Dy = 2*(35 − 32) − 6*(7 − 12) + 2*(8 − 15) = 6 + 30 − 14 = 22

Determinant of z = Dz = | 2  3  6
                          1  2  5
                          3  6  8 |

Dz = 2*(16 − 30) − 3*(8 − 15) + 6*(6 − 6) = −28 + 21 + 0 = −7

We then substitute the values in the following equations to find the solutions of x, y and z.

x = Dx / D = −41 / −5 = 8.2

y = Dy / D = 22 / −5 = −4.4

z = Dz / D = −7 / −5 = 1.4
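Cramer's rule for this system can be coded directly (an illustrative sketch reusing cofactor expansion for the 3 x 3 determinants):

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[2][1] * m[1][2])
          - m[0][1] * (m[1][0] * m[2][2] - m[2][0] * m[1][2])
          + m[0][2] * (m[1][0] * m[2][1] - m[2][0] * m[1][1]))

A = [[2, 3, 2], [1, 2, 4], [3, 6, 7]]
b = [6, 5, 8]

def replace_column(m, col, vals):
    # Substitute the constants in place of the chosen column
    return [[vals[i] if j == col else m[i][j] for j in range(3)] for i in range(3)]

D = det3(A)                              # -5
x = det3(replace_column(A, 0, b)) / D    # 8.2
y = det3(replace_column(A, 1, b)) / D    # -4.4
z = det3(replace_column(A, 2, b)) / D    # 1.4
```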

Exercises

Solve the following systems of equations using Cramer's rule.


1) 3x + 2y + 4z = 4
   6x + 4y + 6z = 8
   2x + 3y + z = 2

2) 4x + 3y + 4z = 7
   10x + 5y + 6z = 2
   8x + 6y + 3z = 4

Eigenvalues and eigenvectors of a 2x2 matrix

Let A be an m x m matrix. The roots of the equation |A − λI| = 0 are called the eigenvalues of A; λ is a scalar and I is the unity matrix. For each eigenvalue there exists an m x 1 vector v, called an eigenvector of the matrix A, satisfying (A − λ1)v1 = 0 and (A − λ2)v2 = 0.

Please consider the following matrix A:

A = [ 1  4
      3  0 ]

Based on the above theoretical equations, calculate the eigenvalues and eigenvectors of this 2 x 2 matrix.

|A − λI| = | 1−λ   4
            3    −λ | = (1 − λ)(−λ) − 12 = λ² − λ − 12 = 0

Apply the quadratic formula to solve the above equation and find the two eigenvalues λ1 and λ2.


λ = (−b ± √(b² − 4ac)) / (2a)

λ = (−(−1) ± √((−1)² − 4*(−12)*(1))) / (2*1) = (1 ± √(1 + 48)) / 2 = (1 ± √49) / 2 = (1 ± 7) / 2

λ1 = 4 and λ2 = −3

Calculate the eigenvector v1 for the eigenvalue λ1

(A − λ1 I)v1 = 0

[ 1−λ1   4       [ v11     [ −3   4     [ v11     [ 0
  3     −λ1 ] *    v12 ] =    3  −4 ] *   v12 ] =   0 ]

From −3 v11 + 4 v12 = 0, the components are in the ratio v11 : v12 = 4 : 3. Thus,

v1 = k1 * [ 4
            3 ]

for any non-zero scalar k1.

Please repeat the same procedure for v2
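The two eigenvalues can be confirmed by solving the characteristic equation λ² − λ − 12 = 0 with the quadratic formula (an illustrative sketch):

```python
from math import sqrt

a, b, c = 1, -1, -12                      # coefficients of lambda^2 - lambda - 12 = 0
discriminant = sqrt(b * b - 4 * a * c)    # sqrt(49) = 7
lam1 = (-b + discriminant) / (2 * a)      # 4.0
lam2 = (-b - discriminant) / (2 * a)      # -3.0
```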

The algebraic expression of a linear regression model


y = [ y1          X = [ x11  ...  x1k
      ...               ...       ...
      yn ]              xn1  ...  xnk ]

where y is the n x 1 vector of observations on the dependent variable and X is the n x k matrix of observations on the explanatory variables, so that the model can be written y = Xβ + ε.

Solving simultaneous equations using matrices

Solve the equations

5x + 9y = - 30

6x – 2y = 28

It is required to produce

[ 1  0     [ x
  0  1 ] *   y ]

In a matrix format the equations can be stated as:

[ 5   9     [ x     [ −30
  6  −2 ] *   y ] =    28 ]

Invert the square matrix:

A⁻¹ = 1/((5 x (−2)) − (9 x 6)) * [ −2  −9
                                   −6   5 ]

1/(−10 − 54) = −1/64

A⁻¹ = −1/64 * [ −2  −9     [ 2/64   9/64
                −6   5 ] =   6/64  −5/64 ]

Multiply both sides of the equation by the inverse.

[ x     [ 2/64   9/64     [ −30
  y ] =   6/64  −5/64 ] *    28 ]

[ x     [  3
  y ] =   −5 ]

So x = 3 and y = −5.
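The inversion method can be packaged as a small illustrative solver for 2 x 2 systems:

```python
def solve2(m, rhs):
    (a, b), (c, d) = m
    det = a * d - b * c
    # Inverse of a 2 x 2 matrix applied to the right-hand side
    x = (d * rhs[0] - b * rhs[1]) / det
    y = (-c * rhs[0] + a * rhs[1]) / det
    return x, y

x, y = solve2([[5, 9], [6, -2]], [-30, 28])   # 3.0 and -5.0
```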

Please consider another example related to the demand equations for two complementary goods, where we are trying to find their prices.

3P1 − 4P2 = 12
−P1 + 6P2 = 10

Solution

Set the simultaneous equations in a matrix format.

[  3  −4     [ P1     [ 12
  −1   6 ] *   P2 ] =   10 ]

Find the determinant.


|A| = (3 x 6) − (−4 x −1) = 18 − 4 = 14

A⁻¹ = 1/14 * [ 6  4     [ 6/14  4/14
               1  3 ] =   1/14  3/14 ]

[ P1     [ 6/14  4/14     [ 12
  P2 ] =   1/14  3/14 ] *   10 ]

Thus, P1 = 112/14 = 8 and P2 = 42/14 = 3.
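The exact solution can be cross-checked with Cramer's rule from earlier in the chapter (an illustrative sketch):

```python
def det2(m):
    return m[0][0] * m[1][1] - m[1][0] * m[0][1]

A = [[3, -4], [-1, 6]]
b = [12, 10]
D = det2(A)                                          # 14

P1 = det2([[b[0], A[0][1]], [b[1], A[1][1]]]) / D    # 112 / 14 = 8.0
P2 = det2([[A[0][0], b[0]], [A[1][0], b[1]]]) / D    # 42 / 14 = 3.0
```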

Linear regression

This chapter will focus on a linear regression model with one explanatory variable. We assume that the error term ε has a zero mean value and that it is distributed independently. The purpose is to get unbiased estimates of the regression coefficients. We will focus on three methods of estimation of the parameters α and β. The first technique is the method of least squares. The second technique is the generalized method of moments, and the last technique is the maximum likelihood method (ML). We will also discuss how to deal with outliers and stochastic regressors. Another topic that will be considered is the inverse prediction problem based on Fieller's method. The problem is basically one of predicting a value of x given a value of y.

A very important method used in Econometrics is regression analysis. But what is the definition of regression analysis? It describes the relationship between a variable called the dependent or explained variable and an independent or explanatory variable. The explained or dependent variable is denoted by y and the explanatory or independent variable is denoted by x. In the case of simple regression problems, k = 1, as we have one independent variable.

There are five basic assumptions that must hold for the linear and multiple regression models to be valid. The coefficients have to be BLUE: they must have the smallest possible variance and be consistent to achieve the best-fit regression. The acronym BLUE stands for the following words:

B: Best


L: Linear
U: Unbiased
E: Estimator

We check whether the coefficients are BLUE through hypothesis testing, for example using the F-statistic.

The first assumption is that the error term u is normally distributed with mean zero and a common variance σ²: u ~ IN(0, σ²).

The second assumption is that the mean of the error term is zero: E(u) = 0.

The third assumption is that the variance of the error term is the same in each time period for all values of the independent variables. This is the homoskedastic situation: Var(u) = σ².

The fourth assumption is that the value of the error term in one time period is uncorrelated with its value in any other time period: ui and uj are independent for all i ≠ j.

The fifth assumption is that the independent variables are uncorrelated with the error term and are not perfectly correlated with each other. Thus, ui and xj are independent for all i and j.

If the above assumptions are violated, then, you should consider diagnosing the problems related to multicollinearity, heteroskedasticity, autocorrelation, and errors in variables.

For example, please consider the following two formulas and decide which point estimators are unbiased and which are consistent.

σ1² = Σ (xi − x̄)² / (n − 1)

σ2² = Σ (xi − x̄)² / n

where the sums run over i = 1, ..., n. The point estimator σ1² is unbiased, while the point estimator σ2² is biased. Both point estimators are consistent.

Method of least squares

Let’s take as an example a financial model related to the returns of a share price in relation to market returns.

Number of observations   Share returns   Market returns
1     3.526787     8.73209
2     -4.34533     -5.19815
3     5.222709     6.21865
4     -4.99619     -5.5393
5     -3.04336     7.69808
6     -2.375422    -4.99735
7     2.651303     5.42777
8     -0.68924     -1.5424
9     0.205664     1.4639
10    2.4783       3.6528
11    0.237407     -0.1494
12    0.329728     0.16688
13    -0.26869     -0.1444
14    0.064769     0.097873
15    -0.5873      -0.09911
16    0.329225     -0.08344
17    -0.11849     0.122767
18    0.011541     -0.45767
19    -0.18757     -0.53046
20    -0.38752     -0.11118
21    -0.26835     -0.28947
22    0.262798     -0.17676
23    0.355054     -1.15686
24    -1.34302     -0.5771
25    -0.77964     0.578182
26    -0.04649     -0.05331
27    0.098381     -0.23054
28    -0.09585     -0.66625
29    -0.0059      -0.50071
30    -0.05415     -0.53128

The dependent variable is share price returns expressed as a percentage and the independent variable is market price returns expressed as a percentage.

The mathematical equation will be as follows:

y_i = α + βx_i + ε_i

Where: y_i is the share price returns (the dependent variable); α is the intercept; β is the coefficient of the independent or explanatory variable; x_i is the market price returns (the independent variable); ε_i is the error term.

We are investigating the effect of changes in market price returns on share price returns. We examine whether there are significant effects of the explanatory variable on the dependent variable. Finally, we could forecast the value of the dependent variable y given different values of the independent variable x.

The stochastic equation in Econometrics will incorporate the error term and it will be as follows:

y_i = α + βx_i + ε_i

Our focus will be on the additional tests of the error term in terms of normality, heteroskedasticity and autocorrelation. They will be covered in more detail in the relevant sections. The role of the error term is to capture the element of randomness that is not observed. It is also used to include the effects of independent variables that are omitted. Finally, it is used to capture the measurement error that is incorporated in the dependent variable y.
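For a single regressor, the least squares estimates have the closed forms β̂ = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)² and α̂ = ȳ − β̂x̄. The sketch below applies them to a small hypothetical dataset (made up for illustration, not the share/market series):

```python
# Minimal OLS sketch on hypothetical data:
# beta_hat = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
# alpha_hat = ybar - beta_hat * xbar
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

xbar = sum(x) / len(x)
ybar = sum(y) / len(y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)

beta_hat = sxy / sxx              # slope estimate
alpha_hat = ybar - beta_hat * xbar  # intercept estimate
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]
```

By construction, the OLS residuals sum to zero whenever an intercept is included in the regression.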

If you still face difficulties concerning how to run and interpret the regression equation, then, please refer to the following document,


“Introduction to Statistics, Probability and Econometrics”. There is a detailed section related to regression.

The first step in EViews 6 is to check for normality or descriptive statistics, stationarity and the correlogram of both the dependent and independent variable.

Before loading and transferring your data into EViews 6, you have to insert the numerical values or the data in Excel. Do not use long titles for each time series; use abbreviations, for example, share for the first time series and market for the second time series. Name the sheet of the Excel file and delete the other sheets; for example, the name of the sheet is reg. Once the Excel file is ready, you close it and you open the statistical package EViews 6.

You press file and then you select new workfile. Then, in workfile structure type, select unstructured / undated. Insert the number of observations; in our case, the number is 30. You will get an untitled workfile. Then press file, then import, then read text - lotus - excel. Select the Excel file and press OK. The Excel spreadsheet import screen will open. In the box upper left data cell, select A2; this is the cell where your first observation starts. In Excel 5+ sheet name, write reg; this is the name of your sheet. In the box name of series, please write share, then press the space bar and then write market. These are the names of the variables.

Once you have done these steps, you will be able to see the time series transferred from Excel to EViews 6. Press file, then save as. Then, write the filename reg and save it with the workfile extension.

You click on a time series, for example share, to do the following tests. You press view and then you select descriptive statistics and tests, then histogram and stats, or correlogram, or unit root test. You do the same for the series market: you open it and then you press view.

To run a regression, you press quick, and then, estimate equation. In the box, you write

share c market and you press OK.

Then, your output will be displayed. From the top menu you have options to do forecast and check the residuals or error term for further tests.

Good luck !


We start with the share price returns histogram and statistics for normality in EViews 6.

[Histogram of SHARE omitted]

Series: SHARE
Sample: 1 30
Observations: 30

Mean        -0.127295
Median      -0.050320
Maximum      5.222709
Minimum     -4.996190
Std. Dev.    1.993636
Skewness     0.055478
Kurtosis     4.702093

Jarque-Bera  3.636790
Probability  0.162286

The null and alternative hypotheses are as follows:

H0: The dependent variable is normally distributed

H1: The dependent variable is not normally distributed


From the above table, the χ² (Jarque-Bera) statistic is 3.64 with a probability value of 0.16. Since the probability value is above the 5% significance level, we cannot reject H0: the joint null hypothesis that the sample skewness equals 0 and the sample kurtosis equals 3 is not rejected. Nevertheless, the distribution is slightly positively skewed and shows excess kurtosis. It is leptokurtic: the kurtosis is 4.70, which is greater than 3.

Please make sure that you are familiar with the measures of location and dispersion in addition to the graph of the normal distribution. Please compare the mean with the standard deviation. Is the distribution leptokurtic, platykurtic or mesokurtic? Compare the probability at the 5% significance level with the Jarque – Bera statistic. If you have difficulties, then, please review the sections related to skewness and kurtosis.

Jarque-Bera normality test

This section focuses on tests of normality related to the dependent or independent variable. The result of the Jarque-Bera test is used to test whether the series is normal or non-normal. This type of test uses the chi-squared distribution and, specifically, it is a goodness-of-fit test. So we state the hypotheses as follows:

H0: The dependent or independent variable is normally distributed

H1: The dependent or independent variable is not normally distributed

The Jarque-Bera test checks normality through the sample skewness and kurtosis.

The mathematical formula is as follows:

JB = (n/6) × [S² + (K − 3)² / 4]    (1)

Where n is the sample size, S is the skewness and K is the kurtosis.


The resulting χ² statistic has an associated p-value, which is compared with the 5% significance level. If the p-value is below 5%, the test is significant and we reject H0; if it is above 5%, it is insignificant and we cannot reject H0.

For example, we will use the results of the skewness and kurtosis that we have calculated in the section related to measures of dispersion.

Sample size n = 5
Skewness S = -0.77
Kurtosis K = -0.58

By substituting the results in equation (1), we have the following results:

JB = (5/6) × [(−0.77)² + (−0.58 − 3)² / 4] = 0.833 × (0.5929 + 0.25 × 12.8164) = 0.833 × 3.797 = 3.16

By checking the chi-squared distribution, you will find that the value 3.16 is below the critical value at the 95% confidence level. The critical value is 9.49, as the degrees of freedom are 5 − 1 = 4. In this case, we cannot reject H0. The dependent or independent variable is normally distributed.
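The worked example above can be reproduced directly from equation (1); the sketch below is illustrative only, with the function name my own:

```python
# Jarque-Bera statistic from equation (1):
# JB = (n/6) * (S^2 + (K - 3)^2 / 4)
def jarque_bera(n, skewness, kurtosis):
    return (n / 6.0) * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# Reproducing the worked example: n = 5, S = -0.77, K = -0.58.
jb = jarque_bera(n=5, skewness=-0.77, kurtosis=-0.58)
# jb is about 3.16, below the chi-squared critical value of 9.49,
# so H0 (normality) cannot be rejected.
```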

As an example, I have attached a screenshot of the net asset value (NAV) of a UK investment trust. The table and the graph show measures of location and dispersion in addition to the Jarque-Bera statistic. It is a very common normality test used in the EViews econometrics software.

The NAV of a closed-end fund or investment trust, usually expressed on a per share basis, is the value of all its assets, less its liabilities, divided by the number of shares.

NAV = (Total assets − Liabilities) / Number of shares

When the share price is below the net asset value it is trading at a discount. Share prices above the net asset value are at a premium. If the closed-end fund is trading at £9.50 and its net asset value is £10, then it is trading at a 5% discount.
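The NAV arithmetic can be sketched as follows (the function names are my own, not from the text):

```python
# NAV per share = (total assets - liabilities) / number of shares.
def nav_per_share(total_assets, liabilities, shares):
    return (total_assets - liabilities) / shares

# Negative result = discount, positive = premium, as a fraction of NAV.
def discount_premium(price, nav):
    return (price - nav) / nav

# A fund trading at 9.50 with a NAV of 10 trades at a 5% discount.
dp = discount_premium(9.50, 10.0)
```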


[Histogram of NAV omitted]

Series: NAV
Sample: 2 158
Observations: 156

Mean         1.160473
Median       0.957070
Maximum     18.62115
Minimum    -17.81061
Std. Dev.    6.583173
Skewness     0.199221
Kurtosis     3.481567

Jarque-Bera  2.539308
Probability  0.280929

From the above table, the χ² statistic, namely 2.54, is below the critical value at the 5% significance level, so we cannot reject H0, even though the distribution is slightly positively skewed and has kurtosis slightly above 3.


We, then, check the correlogram of the share price returns in EViews 6.

Date: 02/04/15  Time: 14:47
Sample: 1 30
Included observations: 30

Lag    AC      PAC     Q-Stat   Prob
 1    -0.428  -0.428    6.0562  0.014
 2     0.247   0.078    8.1397  0.017
 3    -0.209  -0.097    9.6963  0.021
 4     0.098  -0.043   10.048   0.040
 5    -0.281  -0.279   13.075   0.023
 6     0.001  -0.292   13.075   0.042
 7     0.075   0.030   13.310   0.065
 8    -0.105  -0.136   13.788   0.087
 9     0.112  -0.071   14.358   0.110
10    -0.022  -0.072   14.382   0.156
11     0.040  -0.136   14.464   0.208
12    -0.050  -0.068   14.596   0.264


13     0.042  -0.073   14.696   0.327
14    -0.064  -0.133   14.940   0.382
15     0.022  -0.074   14.971   0.453
16     0.018  -0.055   14.993   0.525
17    -0.039  -0.128   15.106   0.588
18    -0.018  -0.173   15.132   0.653
19     0.042  -0.115   15.286   0.704
20     0.077   0.035   15.858   0.725

The hypotheses that have been formulated and tested are as follows:

H0: The time series of the share price returns have no serial correlation.

H1: The time series of the share price returns have serial correlation.

According to the above table, the autocorrelations and the Q-statistics with their associated p-values are statistically significant at the first few lags, but insignificant thereafter, which is a sign that there is no persistent serial correlation.

When using seasonal data as dummies, check the correlogram first. Check the autocorrelation and partial autocorrelation and determine whether the data are stationary or need to be differenced to become stationary. In other words, check the correlogram to determine the autoregressive moving average (ARMA) model that will be used. For example, if you have 200 observations and only the first 3 lags are significant in terms of serial correlation, then use a moving average, MA(3).

Finally, we check the ADF unit root test of the share price returns in EViews 6.

Null Hypothesis: SHARE has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=7)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic    -8.885614     0.0000
Test critical values:   1% level          -3.679322
                        5% level          -2.967767
                       10% level          -2.622989

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(SHARE)
Method: Least Squares
Date: 02/04/15  Time: 14:48
Sample (adjusted): 2 30
Included observations: 29 after adjustments

Variable    Coefficient  Std. Error  t-Statistic  Prob.
SHARE(-1)   -1.427826    0.160690    -8.885614    0.0000
C           -0.308837    0.321027    -0.962026    0.3446

R-squared            0.745173    Mean dependent var    -0.123481
Adjusted R-squared   0.735735    S.D. dependent var     3.355847
S.E. of regression   1.725132    Akaike info criterion  3.994956
Sum squared resid   80.35413     Schwarz criterion      4.089252
Log likelihood     -55.92686     Hannan-Quinn criter.   4.024488
F-statistic         78.95413     Durbin-Watson stat     1.604154
Prob(F-statistic)    0.000000

The ADF test statistic is -8.89, which is smaller than the critical values (-3.68, -2.97, -2.62) at the 1%, 5% and 10% significance levels, so the sample evidence rejects the null hypothesis, namely the existence of a unit root. In other words, the share price returns are a stationary series. I did not include a trend. Note that the t-statistic of SHARE(-1) is the same as the Augmented Dickey-Fuller test statistic. We are testing the null hypothesis that the share price returns have a unit root; therefore, if the ADF statistic were greater than the critical value, we would accept the null hypothesis of a unit root.

Stationarity

A non-stationary series tends to show a statistically significant spurious correlation when the variables are regressed; thus, we may obtain a spuriously high R². We test whether the NAV of UK investment trusts follows a random walk, a random walk with drift and trend, or is stationary. In this section, I will illustrate the EViews output.

For non-stationary series, the mathematical formulas for the random walk and the random walk with drift are as follows:

Random walk: y_t = y_{t-1} + ε_t

Random walk with drift: y_t = μ + y_{t-1} + ε_t

Where:

μ is the drift; y_{t-1} is the dependent variable lagged one period; ε_t is the error term or the residuals.
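The two processes can be simulated to see the effect of the drift; this is an illustrative sketch with made-up parameters (μ = 0.1, standard normal errors), not taken from the text:

```python
import random

# Simulate a pure random walk and a random walk with drift,
# y_t = (mu +) y_{t-1} + e_t, using the same shocks for both.
random.seed(1)
T, mu = 500, 0.1
walk, drift_walk = [0.0], [0.0]
for _ in range(T):
    e = random.gauss(0.0, 1.0)
    walk.append(walk[-1] + e)
    drift_walk.append(drift_walk[-1] + mu + e)

# With identical shocks, the drift series ends mu * T = 50 higher
# than the pure random walk.
gap = drift_walk[-1] - walk[-1]
```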

The unit root test

A popular test of stationarity is the unit root test. The specifications of the test are the following:


Δy_t = γ y_{t-1} + ε_t

Where the null hypothesis to be tested is γ = 0 (a unit root). ε_t is the stochastic error term, assumed to be non-autocorrelated with a zero mean and a constant variance. Such an error term is also known as a white noise error term.

The main problem when performing the ADF test is to decide whether to include a constant term, a linear trend, both, or neither in the test regression. The general principle is to choose a specification that is a plausible description of the data under both the null and the alternative hypothesis (Hamilton 1994, p.501). If the series seems to contain a trend, we should include both a constant and a trend in the test regression. If the series seems not to contain a trend, we should include neither a constant nor a trend. We start by testing if the NAV returns in the UK follow simple random walks (with no constant and no time trend) or are stationary. We state the hypotheses as follows:

H0: γ = 0    H1: γ < 0

The ADF test for NAV of UK investment trusts sector defined by AITC is as follows:

ADF test of the NAV return by excluding a constant and a trend.

Table 1 shows the ADF test of the NAV return for all AITC sectors for the period January 1990 to January 2003 for two different critical values, one per cent and five per cent. We test if the NAV return follows a random walk by excluding a constant and a linear time trend.

ADF Test Statistic   -4.189743    1% Critical Value*   -2.5798
                                  5% Critical Value    -1.9420

*MacKinnon critical values for rejection of hypothesis of a unit root.
Source: calculated by the author

For a level of significance of 1 per cent and a sample size larger than 100 observations, the critical value of the t-statistic from Dickey-Fuller's tables for no intercept and no trend is -2.58. According to Table 1, we can reject the null hypothesis, namely the existence of a unit root, at the one per cent significance level. The ADF test statistic is -4.19. In other words, the NAV return is stationary.


The following tables summarise the unit root test with constant and time trend for the NAV of UK investment trusts. The specification and hypotheses of the test are the following:

ΔY_t = μ + γ y_{t-1} + Σ_{λ=1}^{4} a_λ Δy_{t-λ} + βt + ε_t

Where μ is the drift, Σ_{λ=1}^{4} a_λ Δy_{t-λ} are the lags included so that ε_t contains no autocorrelation, γ is the measure of stationarity, and βt is a measure of the time trend.

We state the hypotheses as follows:

H0: β, γ = 0 (existence of a unit root)
H1: β, γ < 0 (stationarity)

The existence of a unit root is measured using an ADF test. For a 1 per cent significance level and a sample size larger than 100 observations, the critical value of the t-statistic from Dickey-Fuller's tables is -4.02. Table 2 summarises the unit root test of the NAV return for the UK investment trusts sector by AITC.

Table 2 ADF test of UK NAV return by including a constant and a trend.

Table 2 shows the ADF test for the period January 1990 to January 2003 for two different critical values, one per cent and five per cent. We test if the NAV return follows a random walk by including a constant and a linear time trend.

ADF Test Statistic   -4.531134    1% Critical Value*   -4.0237
                                  5% Critical Value    -3.4413

*MacKinnon critical values for rejection of hypothesis of a unit root.
F-statistic 17.97964
Source: calculated by the author

According to Table 2, the sample evidence suggests that we can reject the null hypothesis, namely the existence of a unit root, at the one per cent significance level. The t-statistic for all UK sectors is -4.53, which is more negative than the critical value of -4.02. Thus, the NAV return is stationary. To check if there is a time trend, we compare the F-statistic of the model with the one given in the ADF tables. From our model, the F-statistic is 17.98 > 6.34, so we reject the null hypothesis.


Please review the F-statistic concept stated in the regression section.

Let’s solve a detailed numerical example to understand the ADF unit root test with and without a trend.

Please consider the following time series in different time periods. The time series represent the return of the share prices of a hypothetical supermarket in Boscombe. It is located in the South - West of England.

T = trend   y_t = dependent variable
 1   -2.3478
 2   -1.2731
 3    0.8467
 4    0.7829
 5    3.0372
 6    4.3405
 7    0.7341
 8   -0.8912
 9   -2.3824
10   -1.4827
11   -0.8568
12    0.0785
13    2.4867
14    3.6448
15    4.7522
16    2.8644
17    3.3565
18    5.3247
19    4.3124
20    5.2376
21    4.7863
22    3.2897
23    6.3729
24    7.1256
25    2.6389
26    3.1567

Our purpose is to test whether there is a unit root. By differencing the time series y_t, we will get a white noise series. We are testing the null hypothesis of the existence of


a unit root. Unit root tests should be carried out on all variables, dependent and independent, before deciding whether the statistician / econometrician will run a regression or a cointegration analysis integrated with an error correction model.

The null hypothesis of a unit root is H0: β = γ = 0. By including a trend, we check the F-statistic against the tables of ADF critical values in the appendix at the end of the Econometrics book.

If there is no trend, then H0: γ = 0 and we check the t-statistic from the regression equation against the appendix of ADF critical values at the end of the Econometrics book.

The ADF critical values table is organised by your sample size n, which counts the individual observations. The next three columns carry three titles: no intercept and no trend; intercept and no trend; and intercept and trend.

We want to make sure that our time series is stationary before running the regression with the independent variable. In EViews you will have three options. The first option is to check for unit root for both the dependent and independent variables by using their level. If the time series is stationary, then, it can be used to run a regression without differencing. If it is not stationary, then, you click on 1st difference. If the problem continues, then, click on 2nd difference.

The first step is to difference the time series y_t to obtain Δy_t and then run a regression on y_{t-1}, which is the dependent variable lagged one period. By differencing, we lose the first observation. By using the lagged expression y_{t-1}, we lose the last observation. Let me explain and illustrate this in more detail. I have included an example to show the calculation.
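The differencing and lagging steps can be sketched in a few lines, using the supermarket series above; note that differencing drops the first observation and lagging drops the last, leaving 25 usable pairs:

```python
# dy_t = y_t - y_{t-1}; the lag series is simply y shifted by one period.
y = [-2.3478, -1.2731, 0.8467, 0.7829, 3.0372, 4.3405, 0.7341,
     -0.8912, -2.3824, -1.4827, -0.8568, 0.0785, 2.4867, 3.6448,
     4.7522, 2.8644, 3.3565, 5.3247, 4.3124, 5.2376, 4.7863,
     3.2897, 6.3729, 7.1256, 2.6389, 3.1567]

dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # 25 differences
ylag = y[:-1]                                      # 25 lagged values

# First difference: -1.2731 - (-2.3478) = 1.0747, as in the table below.
```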

T = trend   y_t        Δy_t                           y_{t-1}
 1   -2.3478    -                              -
 2   -1.2731    -1.2731 - (-2.3478) = 1.0747   -2.3478
 3    0.8467     0.8467 - (-1.2731) = 2.1198   -1.2731
 4    0.7829    -0.0638                         0.8467
 5    3.0372     2.2543                         0.7829
 6    4.3405     1.3033                         3.0372
 7    0.7341    -3.6064                         4.3405
 8   -0.8912    -1.6253                         0.7341
 9   -2.3824    -1.4912                        -0.8912
10   -1.4827     0.8997                        -2.3824
11   -0.8568     0.6259                        -1.4827
12    0.0785     0.9353                        -0.8568
13    2.4867     2.4082                         0.0785
14    3.6448     1.1581                         2.4867
15    4.7522     1.1074                         3.6448
16    2.8644    -1.8878                         4.7522
17    3.3565     0.4921                         2.8644


18    5.3247     1.9682                         3.3565
19    4.3124    -1.0123                         5.3247
20    5.2376     0.9252                         4.3124
21    4.7863    -0.4513                         5.2376
22    3.2897    -1.4966                         4.7863
23    6.3729     3.0832                         3.2897
24    7.1256     0.7527                         6.3729
25    2.6389    -4.4867                         7.1256
26    3.1567     0.5178                         2.6389

Then, we run the regression of Δy_t on y_{t-1} to check for a unit root without a trend.

Δy_t      y_{t-1}
 1.0747   -2.3478
 2.1198   -1.2731
-0.0638    0.8467
 2.2543    0.7829
 1.3033    3.0372
-3.6064    4.3405
-1.6253    0.7341
-1.4912   -0.8912
 0.8997   -2.3824
 0.6259   -1.4827
 0.9353   -0.8568
 2.4082    0.0785
 1.1581    2.4867
 1.1074    3.6448
-1.8878    4.7522
 0.4921    2.8644
 1.9682    3.3565
-1.0123    5.3247
 0.9252    4.3124
-0.4513    5.2376
-1.4966    4.7863
 3.0832    3.2897
 0.7527    6.3729
-4.4867    7.1256
 0.5178    2.6389


SUMMARY OUTPUT

Regression Statistics
Multiple R          0.420665
R Square            0.176959
Adjusted R Square   0.141175
Standard Error      1.713097
Observations        25

ANOVA
             df   SS        MS         F         Significance F
Regression    1   14.51255  14.51255   4.945154  0.036267
Residual     23   67.49815   2.934702
Total        24   82.0107

            Coefficients  Standard Error  t Stat     P-value   Lower 95%  Upper 95%
Intercept    0.855414     0.44608          1.917623  0.06766   -0.06737    1.7782
yt-1        -0.2797       0.125776        -2.223770  0.036267  -0.53989   -0.01951

The estimated regression is Δy_t = 0.86 − 0.28 y_{t-1}. The t-statistic on y_{t-1} is (-2.22).

Then, we compare the t-statistic with the ADF critical values. In our case, t = -2.22 > -3.33. I found the value -3.33 in the table related to intercept but no trend with a sample size n = 25. The sample evidence suggests that we cannot reject the null hypothesis. There is a unit root.
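The same no-trend regression of Δy_t on y_{t-1} can be reproduced with a hand-rolled OLS sketch. Assuming the data are transcribed correctly, the intercept, slope and t-statistic agree with the Excel output above (0.86, -0.28, -2.22) up to rounding:

```python
# OLS of dy_t on y_{t-1} with intercept, plus the t-statistic of the slope.
y = [-2.3478, -1.2731, 0.8467, 0.7829, 3.0372, 4.3405, 0.7341,
     -0.8912, -2.3824, -1.4827, -0.8568, 0.0785, 2.4867, 3.6448,
     4.7522, 2.8644, 3.3565, 5.3247, 4.3124, 5.2376, 4.7863,
     3.2897, 6.3729, 7.1256, 2.6389, 3.1567]
dy = [y[t] - y[t - 1] for t in range(1, len(y))]
ylag = y[:-1]

n = len(dy)
xbar = sum(ylag) / n
ybar = sum(dy) / n
sxx = sum((x - xbar) ** 2 for x in ylag)
sxy = sum((x - xbar) * (v - ybar) for x, v in zip(ylag, dy))

b = sxy / sxx                  # slope, about -0.28
a = ybar - b * xbar            # intercept, about 0.86
ssr = sum((v - a - b * x) ** 2 for x, v in zip(ylag, dy))
s2 = ssr / (n - 2)             # residual variance
se_b = (s2 / sxx) ** 0.5       # standard error of the slope
t_b = b / se_b                 # about -2.22 > -3.33, so a unit root
```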

The time series is not stationary. We difference the values once again. In other words, we subtract the value at period t-1 from the value at period t.

ΔΔy_t     Δy_{t-1}
 1.0451    1.0747
-2.1836    2.1198
 2.3181   -0.0638
-0.951     2.2543
-4.9097    1.3033
 1.9811   -3.6064
 0.1341   -1.6253
 2.3909   -1.4912
-0.2738    0.8997
 0.3094    0.6259
 1.4729    0.9353
-1.2501    2.4082
-0.0507    1.1581
-2.9952    1.1074
 2.3799   -1.8878
 1.4761    0.4921
-2.9805    1.9682
 1.9375   -1.0123
-1.3765    0.9252
-1.0453   -0.4513
 4.5798   -1.4966
-2.3305    3.0832
-5.2394    0.7527
 5.0045   -4.4867


SUMMARY OUTPUT

Regression Statistics
Multiple R          0.704163
R Square            0.495845
Adjusted R Square   0.472929
Standard Error      1.921617
Observations        24

ANOVA
             df   SS         MS         F        Significance F
Regression    1   79.89853   79.89853   21.6374  0.000123
Residual     22   81.23746    3.692612
Total        23  161.136

            Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept    0.181997     0.394721        0.461078  0.649268  -0.63661    1.0006
Δyt-1       -0.98759      0.212313       -4.6516    0.000123  -1.4279    -0.54728

The estimated regression is ΔΔy_t = 0.18 − 0.99 Δy_{t-1}. The t-statistic on Δy_{t-1} is (-4.65).

Then, we compare the t-statistic with the ADF critical values. In our case, t = -4.65 < -3.33. I found the value -3.33 in the table related to intercept but no trend. The sample evidence suggests that we can reject the null hypothesis.

There is no unit root. The time series is stationary and can be used for regression analysis.

We run the same regression by including the trend.


T = trend   Δy_t      y_{t-1}
 1    1.0747   -2.3478
 2    2.1198   -1.2731
 3   -0.0638    0.8467
 4    2.2543    0.7829
 5    1.3033    3.0372
 6   -3.6064    4.3405
 7   -1.6253    0.7341
 8   -1.4912   -0.8912
 9    0.8997   -2.3824
10    0.6259   -1.4827
11    0.9353   -0.8568
12    2.4082    0.0785
13    1.1581    2.4867
14    1.1074    3.6448
15   -1.8878    4.7522
16    0.4921    2.8644
17    1.9682    3.3565
18   -1.0123    5.3247
19    0.9252    4.3124
20   -0.4513    5.2376
21   -1.4966    4.7863
22    3.0832    3.2897
23    0.7527    6.3729
24   -4.4867    7.1256
25    0.5178    2.6389

SUMMARY OUTPUT


Regression Statistics
Multiple R          0.474019
R Square            0.224694
Adjusted R Square   0.154212
Standard Error      1.700045
Observations        25

ANOVA
             df   SS        MS        F         Significance F
Regression    2   18.42732  9.213658  3.187947  0.060842
Residual     22   63.58338  2.890154
Total        24   82.0107

            Coefficients  Standard Error  t Stat     P-value   Lower 95%  Upper 95%
Intercept    0.169658     0.736985        0.230206   0.820059  -1.35876    1.698073
yt-1        -0.43048      0.1799         -2.392880   0.025686  -0.80357   -0.05739
T = trend    0.079092     0.067958        1.163837   0.256957  -0.06184    0.220029

The estimated regression is Δy_t = 0.17 + 0.08 T − 0.43 y_{t-1}. The t-statistics are (1.16) for the trend and (-2.39) for y_{t-1}.

Then, we compare the F-statistic with the ADF critical values. In our case, F = 3.19 < 7.24, so we cannot reject the null hypothesis. Again, the time series, when including the trend, is not stationary.

Please repeat the steps as above by differencing the series once again and including the trend. Comment on your result.


We repeat the same procedure for the independent variable market price returns in EViews 6.

We start with the market price returns histogram and statistics for normality in EViews 6.

[Histogram of MARKET omitted]

Series: MARKET
Sample: 1 30
Observations: 30

Mean         0.370795
Median      -0.146900
Maximum      8.732090
Minimum     -5.539300
Std. Dev.    3.236802
Skewness     0.820624
Kurtosis     4.142626

Jarque-Bera  4.999109
Probability  0.082122

The null and alternative hypotheses are as follows:

H0: The independent variable is normally distributed

H1: The independent variable is not normally distributed

From the above table, the χ² statistic is 4.999 with a probability value of 0.08. Since the probability value is above the 5% significance level, we cannot reject H0.

We, then, check the correlogram of the market price returns in EViews 6.


Date: 02/04/15  Time: 14:52
Sample: 1 30
Included observations: 30

Lag    AC      PAC     Q-Stat   Prob
 1    -0.744  -0.744   18.303   0.000
 2     0.684   0.293   34.337   0.000
 3    -0.494   0.204   43.008   0.000
 4     0.410  -0.001   49.229   0.000
 5    -0.195   0.263   50.691   0.000
 6     0.128   0.030   51.347   0.000
 7     0.016   0.057   51.358   0.000
 8    -0.043   0.035   51.440   0.000
 9     0.102  -0.056   51.916   0.000
10    -0.013   0.206   51.924   0.000
11    -0.005  -0.072   51.926   0.000
12    -0.012  -0.283   51.933   0.000
13    -0.023  -0.065   51.963   0.000
14    -0.025  -0.231   52.001   0.000
15    -0.002  -0.137   52.001   0.000
16    -0.025   0.063   52.045   0.000
17    -0.010  -0.081   52.052   0.000
18    -0.043   0.002   52.202   0.000
19    -0.023  -0.093   52.250   0.000
20    -0.023  -0.114   52.300   0.000

Finally, we check the ADF unit root test of the market price returns in EViews 6.


Null Hypothesis: MARKET has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=7)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic    -19.81362     0.0001
Test critical values:   1% level           -3.679322
                        5% level           -2.967767
                       10% level           -2.622989

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(MARKET)
Method: Least Squares
Date: 02/04/15  Time: 14:52
Sample (adjusted): 2 30
Included observations: 29 after adjustments

Variable     Coefficient  Std. Error  t-Statistic  Prob.
MARKET(-1)   -1.744788    0.088060    -19.81362    0.0000
C             0.381806    0.286830      1.331124   0.1943

R-squared            0.935650    Mean dependent var    -0.319427
Adjusted R-squared   0.933267    S.D. dependent var     5.933619
S.E. of regression   1.532821    Akaike info criterion  3.758569
Sum squared resid   63.43760     Schwarz criterion      3.852865
Log likelihood     -52.49925     Hannan-Quinn criter.   3.788101
F-statistic        392.5795     Durbin-Watson stat      1.286718
Prob(F-statistic)    0.000000

Please comment on the ADF unit root test result.

I will include the steps on how to perform a multiple regression in Excel.


1. Plot or insert the data in Excel.
2. Press tools, then select data analysis, then select regression.
3. Then, in input Y range, select the label and the data of the dependent variable. In our case, it is the share price returns.
4. Then, in input X range, select the label and the data of the independent variable. In our case, it is the market price returns.
5. Select the boxes labels and confidence levels. Select the required confidence level, for example, 90%, 95% or 99%.
6. Press the output range box and select the cell where your output will be displayed.
7. Finally, select the box residuals so that the residuals table and data will be displayed.

The regression output in Excel and EViews 6 is as follows:


The output of regression and the residuals in Excel

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.686353
R Square            0.47108
Adjusted R Square   0.452191
Standard Error      1.475573
Observations        30

ANOVA
             df   SS           MS           F         Significance F
Regression    1   54.29814441  54.29814441  24.93811  2.82E-05
Residual     28   60.96484018   2.177315721
Total        29  115.2629846

            Coefficients  Standard Error  t Stat        P-value   Lower 95%  Upper 95%
Intercept   -0.28405      0.27122402      -1.047275314  0.303929  -0.83962   0.271532
Market       0.422744     0.084653626      4.993807416  2.82E-05   0.249339  0.596149

y_i = −0.28 + 0.42 x_i + ε_i

The t-statistics are (-1.05) and (4.99).

RESIDUAL OUTPUT

Observation   Predicted Share   Residuals
 1    3.407391606    0.119395394
 2   -2.481532452   -1.863797548
 3    2.344850165    2.877858835
 4   -2.625751535   -2.370438465
 5    2.97027018    -6.01363018
 6   -2.396645476    0.021223476
 7    2.010510465    0.640792535
 8   -0.93608642     0.24684642
 9    0.334808582   -0.129144582
10    1.260152716    1.218147284
11   -0.347204161    0.584611161
12   -0.213498718    0.543226718
13   -0.345090441    0.076400441
14   -0.242671007    0.307440007
15   -0.32594437    -0.2613556
16   -0.319319973    0.648544973
17   -0.23214722     0.1136572
18   -0.477523424    0.489064424
19   -0.508294953    0.320724953
20   -0.331046889   -0.056473111
21   -0.406417899    0.138067899
22   -0.358770434    0.621568434
23   -0.773101735    1.128155735
24   -0.528011729   -0.815008271
25   -0.039623305   -0.740016695
26   -0.306582699    0.260092699
27   -0.381505601    0.479886601
28   -0.565699348    0.469849348
29   -0.495718322    0.489818322
30   -0.508641603    0.454491603

The residuals are calculated as actual y minus predicted ŷ. In terms of a mathematical formula, the notation is as follows:

ε = y − ŷ.

For example, for the first four observations, the error terms are calculated as follows:

ε₁ = 3.526787 − 3.407391606 = 0.119395
ε₂ = −4.34533 − (−2.481532452) = −1.8637975
ε₃ = 5.222709 − 2.344850165 = 2.8778588
ε₄ = −4.99619 − (−2.625751535) = −2.370438
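The same arithmetic in a short sketch, using the rounded coefficients from the regression output (so the results match the residual table only to about four decimal places):

```python
# Residuals for the first four observations: resid = y - (alpha + beta * x).
# alpha and beta are the rounded coefficients from the regression output.
alpha, beta = -0.284046, 0.422744
share = [3.526787, -4.34533, 5.222709, -4.99619]
market = [8.73209, -5.19815, 6.21865, -5.5393]

predicted = [alpha + beta * x for x in market]
residuals = [y - p for y, p in zip(share, predicted)]
# residuals[0] is about 0.1194, matching the first row of the residual table.
```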

The regression output in EViews 6

Before loading and transferring your data in EViews 6, you have to insert the numerical values or the data in Excel. Do not use long titles for each time – series. Use abbreviations. For example, share for the first time – series and market for the second time - series. Name the sheet of the Excel file and delete the other sheets. For example, the name of the sheet is reg. Once the Excel file is ready, you close it and you open the statistical package EViews 6. You press file and then you select new


worksheet. Then, in workfile structure type, select unstructured / undated. Insert the number of observations. In our case, the number is 30. You will get an untitled worksheet. Then press file, then, import, then, read text - lotus – excel. Select the Excel file and press OK. The Excel spreadsheet import screen will open. In the box, upper left data cell, select A2. This is the cell where your first observation starts. In Excel 5 + sheet name, write reg. This is the name of your sheet. In the box name of series, please write share, then press the space bar and then write market. These are the names of the variables.

Once you have done these steps, you will be able to see the time series transferred from Excel to EViews 6. Press file, then, save as. Then, write the filename reg and save it with the extension wf.

The dependent variable is share price returns and the independent variable is market price returns. To get the output of the regression equation, press quick from the menu at the top of the EViews 6 screen, then estimate equation, then keep the option of the least squares regression equation. In equation specification, please write the dependent variable followed by regressors. Thus, you write share c market. Then, press OK and you will get the following output.

Dependent Variable: SHARE
Method: Least Squares
Date: 02/04/15  Time: 14:57
Sample: 1 30
Included observations: 30

Variable Coefficient Std. Error t-Statistic Prob.

C        -0.284046  0.271224  -1.047275  0.3039
MARKET    0.422744  0.084654   4.993807  0.0000

R-squared           0.471080    Mean dependent var    -0.127295
Adjusted R-squared  0.452191    S.D. dependent var     1.993636
S.E. of regression  1.475573    Akaike info criterion  3.680310
Sum squared resid   60.96484    Schwarz criterion      3.773723
Log likelihood     -53.20465    Hannan-Quinn criter.   3.710194
F-statistic         24.93811    Durbin-Watson stat     1.868822
Prob(F-statistic)   0.000028

ŷᵢ = −0.28 + 0.42xᵢ

Correlogram of residuals in EViews 6

Date: 02/04/15  Time: 14:58
Sample: 1 30
Included observations: 30

Lag     AC      PAC     Q-Stat   Prob


1    0.064   0.064   0.1347   0.714
2   -0.264  -0.270   2.5331   0.282
3    0.142   0.196   3.2497   0.355
4    0.051  -0.063   3.3461   0.502
5   -0.103  -0.010   3.7519   0.586
6   -0.090  -0.117   4.0744   0.667
7    0.014   0.008   4.0827   0.770
8   -0.028  -0.072   4.1173   0.846
9   -0.000   0.053   4.1173   0.904
10   0.020  -0.023   4.1374   0.941
11  -0.009   0.011   4.1413   0.966
12  -0.031  -0.055   4.1930   0.980
13   0.000   0.006   4.1930   0.989
14  -0.079  -0.124   4.5715   0.991
15   0.020   0.079   4.5970   0.995
16   0.016  -0.071   4.6150   0.997
17  -0.071  -0.004   4.9861   0.998
18  -0.116  -0.175   6.0713   0.996
19   0.079   0.121   6.6174   0.996
20   0.154   0.040   8.9027   0.984

Histogram, descriptive statistics and normality test of the residuals in EViews 6

The following figure plots the histogram of the residuals together with the Jarque-Bera (JB) statistic. The probability value of 0.00 is highly significant, so we can reject the null hypothesis of normality. I have attached the actual, the fitted and the residual data in addition to the residual plot. The residual plot is usually illustrated as a line in EViews, but when pasted into Word it appears as asterisks.


Actual  Fitted  Residual  Residual Plot
 3.526787            3.348225696712871   0.1785613032871293  | . * . |
-4.345329999999999  -2.924408515162379  -1.420921484837621   | * | . |
 5.222709            2.377959223653067   2.844749776346933   | . | . * |
-4.996190000000001  -2.606486135803714  -2.389703864196287   | * . | . |
-3.04336             2.821246722897354  -5.864606722897354   | * . | . |
-2.375422           -2.017710802082159  -0.3577111979178414  | . *| . |
 2.651303            2.287338015623336   0.3639649843766633  | . |* . |
-0.68924            -1.002409368614623   0.3131693686146225  | . |* . |
 0.205664            0.3665048570755033 -0.1608408570755033  | . * . |
 2.4783              1.100434691872405   1.377865308127595   | . | * |
 0.237407           -0.4723536224359087  0.7097606224359088  | . |* . |
 0.329728           -0.1440923644346374  0.4738203644346374  | . |* . |
-0.26869            -0.5446244098459764  0.2759344098459764  | . * . |
 0.06476899999999999 0.3875254689490014 -0.3227564689490014  | . *| . |
-0.5873             -0.6335846793477813  0.04628467934778119 | . * . |
 0.329225           -0.4311005888218602  0.7603255888218602  | . |* . |
-0.11849            -0.141179432252036   0.022689432252036   | . * . |
 0.011541           -0.6905806353841579  0.7021216353841579  | . |* . |
-0.18757            -0.3146511194657075  0.1270811194657074  | . * . |
-0.38752            -0.221434459922134  -0.166085540077866   | . * . |
-0.26835            -0.1365414023303287 -0.1318085976696712  | . * . |
 0.262798           -0.4579100953393257  0.7207080953393257  | . |* . |
 0.355054           -1.059188816827552   1.414242816827552   | . | * |
-1.34302            -0.3796731122458502 -0.9633468877541499  | .* | . |
-0.77964            -0.2306442604186535 -0.5489957395813465  | . *| . |
-0.04649            -0.2770687944360496  0.2305787944360496  | . * . |
 0.098381           -0.4229785741388308  0.5213595741388308  | . |* . |
-0.09585            -0.1460365944041081  0.05018659440410811 | . * . |
-0.0059             -0.5292816215985063  0.5233816215985063  | . |* . |
-0.05415            -0.724141271471262   0.6699912714712619  | . |* . |

I have also attached the residual, the actual and the fitted values.

[Figure: line plot of the Residual, Actual and Fitted series over the 30 observations.]


[Figure: histogram of the residuals.]

Series: Residuals
Sample: 1 30
Observations: 30

Mean       7.40e-18
Median     0.283766
Maximum    2.877859
Minimum   -6.013630
Std. Dev.  1.449909
Skewness  -2.409201
Kurtosis   11.50697

Jarque-Bera  119.4820
Probability  0.000000
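As a quick check (my own sketch, not EViews output), the JB statistic can be reproduced from the reported skewness and kurtosis:

```python
# Jarque-Bera statistic: JB = n/6 * (S**2 + (K - 3)**2 / 4),
# with S and K copied from the descriptive statistics above.
n, S, K = 30, -2.409201, 11.50697
jb = n / 6 * (S ** 2 + (K - 3) ** 2 / 4)
print(round(jb, 2))  # 119.48
```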

Serial correlation LM test of the residuals in EViews 6

F-statistic     1.108070    Prob. F(2,26)        0.3453
Obs*R-squared   2.356247    Prob. Chi-Square(2)  0.3079

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 02/04/15  Time: 14:59
Sample: 1 30
Included observations: 30
Presample missing value lagged residuals set to zero.

Variable Coefficient Std. Error t-Statistic Prob.

C          -0.014204  0.260518  -0.054522  0.9569
MARKET      0.018418  0.082712   0.222674  0.8255
RESID(-1)   0.086741  0.184331   0.470571  0.6419
RESID(-2)  -0.276039  0.183629  -1.503246  0.1448

R-squared           0.078542    Mean dependent var     7.40E-18
Adjusted R-squared  0.045632    S.D. dependent var     1.449909
S.E. of regression  1.416441    Akaike info criterion  3.731846
Sum squared resid   56.17657    Schwarz criterion      3.918672
Log likelihood     -51.97769    Hannan-Quinn criter.   3.791613


F-statistic         0.795537    Durbin-Watson stat     1.873169
Prob(F-statistic)   0.506731

Numerical example of simple regression

A company's sales are affected by its marketing expenditures. The research and statistics department has recorded seven observations. Find the equation of the least-squares regression line, assuming that company sales is the dependent variable (y) and marketing expenditures is the independent variable (x). Find the standard errors of the parameters α̂ and β̂. Test the hypothesis that the true value of β = 1.10 using the t-distribution at the 75% confidence level. Calculate the confidence intervals. Calculate the F-statistic and compare it with the Excel output. Find the equation of the estimated residuals. Find the coefficient of determination and the correlation coefficient between company sales and marketing expenditures and comment on the result. Plot a scatter diagram. Predict the effect on monthly sales when the marketing expenditures rise to 8. Calculate the variance of the prediction error, the standard error and the confidence interval.

Company sales (000) Pounds    Marketing expenditures (000) Pounds
10    1
12    2
14    3
17    4
18    5
19    7
20    8

Solution

In order to find the least-squares regression line, we should estimate the parameters α̂ and β̂:

α̂ = ȳ − β̂x̄


ȳ = ∑y / n

x̄ = ∑x / n

β̂ = Sxy / Sxx = (∑xᵢyᵢ − n x̄ ȳ) / (∑xᵢ² − n x̄²)

From the above table, you can calculate x², y² and xy.

Company sales, y, (000) Pounds    Marketing expenditures, x, (000) Pounds    y²    x²    xy
10      1    100     1    10
12      2    144     4    24
14      3    196     9    42
17      4    289    16    68
18      5    324    25    90
19      7    361    49    133
20      8    400    64    160
Total  110  30  1814  168  527


ȳ = ∑y / n = 110 / 7 = 15.71

x̄ = ∑x / n = 30 / 7 = 4.2857

β̂ = Sxy / Sxx = (∑xᵢyᵢ − n x̄ ȳ) / (∑xᵢ² − n x̄²)
  = (527 − 7 × 4.2857 × 15.71) / (168 − 7 × (4.2857)²)
  = (527 − 471.298) / (168 − 128.57)
  = 55.70 / 39.43
  = 1.41

α̂ = ȳ − β̂x̄ = 15.71 − 1.41 × 4.2857 = 15.71 − 6.04 = 9.67

The least - squares regression equation is as follows:

yᵢ = 9.67 + 1.41xᵢ + ε̂ᵢ

I have run the regression in Excel and I have attached the output. Please compare the figures and make sure that you understand the whole procedure and calculations. If you still experience difficulties, then, please e-mail me.
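The hand calculation above can also be checked with a few lines of Python (a sketch using only the summation formulas from this section; the variable names are mine, not Excel's):

```python
# beta_hat = Sxy / Sxx and alpha_hat = ybar - beta_hat * xbar,
# applied to the seven observations of the example.
y = [10, 12, 14, 17, 18, 19, 20]   # company sales (000) Pounds
x = [1, 2, 3, 4, 5, 7, 8]          # marketing expenditures (000) Pounds
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar
Sxx = sum(xi * xi for xi in x) - n * xbar ** 2
beta = Sxy / Sxx
alpha = ybar - beta * xbar
print(round(alpha, 6), round(beta, 5))  # 9.673913 1.40942
```

The unrounded values agree with the Excel coefficients 9.673913 and 1.40942 below.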

SUMMARY OUTPUT

Regression Statistics
Multiple R         0.957513
R Square           0.91683
Adjusted R Square  0.900196
Standard Error     1.192063
Observations       7

ANOVA

            df    SS        MS        F         Significance F
Regression   1    78.3235   78.3235   55.11802  0.00069847
Residual     5    7.105072  1.421014
Total        6    85.42857


                                         Coefficients  Standard Error  t Stat    P-value   Lower 95%
Intercept                                9.673913      0.930035        10.40166  0.000141  7.28318554
Marketing expenditures, x, (000) Pounds  1.40942       0.189843        7.424151  0.000698  0.92141508

RESIDUAL OUTPUT

Observation  Predicted Company sales, y, (000) Pounds  Residuals
1    11.08333   -1.08333
2    12.49275   -0.49275
3    13.90217    0.097826
4    15.31159    1.688406
5    16.72101    1.278986
6    19.53986   -0.53986
7    20.94928   -0.94928

The least - squares regression equation is as follows:

yᵢ = 9.67 + 1.41xᵢ + ε̂ᵢ

The standard errors of the parameters α and β are calculated as follows:

First of all, we calculate the variances of α̂ and β̂:

V(α̂) = σ̂² (1/n + x̄²/Sxx)   (1)

V(β̂) = σ̂² / Sxx   (2)

Then, we estimate σ̂² by using the following equation:


σ̂² = (1/(n − 2)) (Syy − Sxy²/Sxx)   (3)

Syy = ∑yᵢ² − nȳ² = 1814 − 7 × (15.71)² = 1814 − 1727.6287 = 86.3713

Sxx = ∑xᵢ² − nx̄² = 168 − 7 × (4.2857)² = 168 − 128.57057 = 39.4294

Sxy = ∑xᵢyᵢ − n x̄ ȳ = 527 − 7 × 4.2857 × 15.71 = 527 − 471.2984 = 55.70

The standard errors of the parameters α̂ and β̂ are the square roots of the variances in equations (1) and (2), with σ̂² substituted in:

SE(α̂) = √V(α̂)

SE(β̂) = √V(β̂)

Please complete the calculations…………
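The remaining arithmetic can be completed as in the following sketch (my own code, using the Sxx, Syy and Sxy definitions above); the results reproduce the standard errors in the Excel output:

```python
# sigma2_hat = (Syy - Sxy**2/Sxx) / (n - 2), then the standard errors
# follow from equations (1) and (2).
y = [10, 12, 14, 17, 18, 19, 20]
x = [1, 2, 3, 4, 5, 7, 8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum(xi * xi for xi in x) - n * xbar ** 2
Syy = sum(yi * yi for yi in y) - n * ybar ** 2
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar
sigma2 = (Syy - Sxy ** 2 / Sxx) / (n - 2)        # estimated error variance
se_alpha = (sigma2 * (1 / n + xbar ** 2 / Sxx)) ** 0.5
se_beta = (sigma2 / Sxx) ** 0.5
print(round(se_alpha, 6), round(se_beta, 6))  # 0.930035 0.189843
```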

Calculation of the confidence intervals for α and β . I have attached the Excel output for the lower and upper value.

                                         Coefficients  Standard Error  t Stat    P-value   Lower 95%   Upper 95%
Intercept                                9.673913      0.930035        10.40166  0.000141  7.28318554  12.06464
Marketing expenditures, x, (000) Pounds  1.40942       0.189843        7.424151  0.000698  0.92141508  1.897425

With 95% confidence level, we are using the cumulative student’s t - distribution. The confidence intervals for α are calculated as follows:

α̂ ± 2.571 SE(α̂)

9.67 + 2.571 × 0.93 = 12.06
and
9.67 − 2.571 × 0.93 = 7.28

We got the value of 2.571 by checking the t-table. We have seven observations, and the degrees of freedom are n − 2 = 5. Thus, we look up the t-value with five degrees of freedom, which is 2.571. Please compare the calculations with the table to make sure that you understand.

With 95% confidence level, we are using the cumulative student’s t distribution. The confidence intervals for β are calculated as follows:


β̂ ± 2.571 SE(β̂)

1.41 + 2.571 × 0.19 = 1.90
and
1.41 − 2.571 × 0.19 = 0.92

Test the hypothesis that the true value of β = 1.10 using the t-distribution at the 75% confidence level. We have already calculated that β̂ = 1.41 and SE(β̂) = 0.19.

Please state the hypotheses:

H₀: β = 1.10
H₁: β ≠ 1.10

Please use the following equation:

t = (β̂ − β) / SE(β̂)

t = (1.41 − 1.10) / 0.19 = 1.63

The degrees of freedom are n − 2 = 7 − 2 = 5. At the 75% confidence level, the critical t-value is 0.727. The calculated t-statistic is 1.63, which is higher than the critical value of 0.727. Therefore, the evidence suggests that we can reject the null hypothesis at the 75% confidence level. Because the estimate lies 1.63 standard errors from the hypothesized value, the null hypothesis is rejected, and the statistician would conclude that the true value of the parameter β is not equal to 1.10.
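The test statistic itself is plain arithmetic, which can be sketched as:

```python
# t = (beta_hat - beta0) / SE(beta_hat), with the rounded values from the text.
t = (1.41 - 1.10) / 0.19
print(round(t, 2))  # 1.63
```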

The F-statistic is calculated as follows:

F = r² / [(1 − r²)/(n − 2)] = 0.91683 / [(1 − 0.91683)/(7 − 2)] = 0.91683 / 0.016634 = 55.12

It could also be calculated by using another equation. The F-statistic is equal to the ratio of the mean square of the explained variation, MSR, to the mean square of the unexplained variation, MSE:

F = MSR / MSE = 78.3235 / 1.4210 = 55.12
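Both routes to the F-statistic can be checked in a couple of lines (a sketch using the values from the Excel output):

```python
# F from the coefficient of determination, and F = MSR / MSE from the
# ANOVA table; the two should agree.
r2 = 0.91683                        # R-squared
n = 7
F1 = r2 / ((1 - r2) / (n - 2))      # from r-squared
F2 = 78.3235 / 1.421014             # MSR / MSE
print(round(F1, 2), round(F2, 2))   # 55.12 55.12
```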


The estimated residuals equation will be as follows:

ε̂ᵢ = yᵢ − α̂ − β̂xᵢ

ε̂ᵢ = yᵢ − 9.67 − 1.41xᵢ

The coefficient of determination and the correlation coefficient are calculated as follows:

R² = Explained variation (ESS) / Total variation (TSS) = 78.3235 / 85.42857 = 0.9168 (to 4 d.p.)

r = √0.9168 = 0.9575 (to 4 d.p.)

Another formula to calculate the correlation coefficient is as follows:

r = β̂ sx / sy = 1.41 × 2.56 / 3.77 = 3.6096 / 3.77 = 0.9575 (to 4 d.p.)

where r is the correlation coefficient, sy is the sample standard deviation of the y variable, and sx is the sample standard deviation of the x variable.

The Excel formula to calculate sy is =STDEV(cell range). The same formula is used to calculate sx.
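The same identity can be sketched in Python, where `statistics.stdev` plays the role of Excel's STDEV:

```python
# r = beta_hat * s_x / s_y, with sample standard deviations.
import statistics
y = [10, 12, 14, 17, 18, 19, 20]
x = [1, 2, 3, 4, 5, 7, 8]
r = 1.40942 * statistics.stdev(x) / statistics.stdev(y)
print(round(r, 4))  # 0.9575
```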

The scatter diagram will be as follows:

[Scatter diagram of company sales (y-axis) against marketing expenditures (x-axis), with fitted trendline y = 1.4094x + 9.6739 and R² = 0.9168.]


Predict the effect on monthly sales when the marketing expenditures rise to 8.

The least - squares regression equation is as follows:

ŷᵢ = 9.67 + 1.41 × 8 = 9.67 + 11.28 = 20.95   (1)

The variance of the prediction error is given by the following equation:

Variance = σ̂² [1 + 1/n + (x₀ − x̄)²/Sxx]

where x₀ = 8, x̄ = 4.2857, Sxx = 39.4294 and n = 7.

Variance = σ̂² [1 + 0.14 + (8 − 4.2857)²/39.4294] = 1.49σ̂² (to 2 d.p.)

The standard error of the prediction is as follows:

From the ANOVA table, RSS = 7.105 and the degrees of freedom are n − 2 = 7 − 2 = 5.

The estimated variance is σ̂² = RSS / d.f. = 7.105 / 5 = 1.421

SE = √(1.49 × 1.421) = 1.46   (2)

The confidence interval with 95% confidence level is calculated as follows:

The critical value is found from the t-distribution. With 5 degrees of freedom, the value is 2.571. From equations (1) and (2), we have the following result:

20.95 ± 2.571 × 1.46

or 20.95 + 3.74 = 24.69
   20.95 − 3.74 = 17.21

The estimated points of the confidence interval are (17.21, 24.69).
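The prediction-interval arithmetic can be sketched end to end as follows (my own check, taking σ̂² from this example's ANOVA table, RSS = 7.105072 with df = n − 2 = 5):

```python
# Prediction interval for x0 = 8 in the sales/marketing example.
n, x0, xbar, Sxx = 7, 8, 4.2857, 39.4294
sigma2 = 7.105072 / 5                          # estimated error variance
var_factor = 1 + 1 / n + (x0 - xbar) ** 2 / Sxx
se_pred = (var_factor * sigma2) ** 0.5         # SE of the prediction error
y_hat = 9.673913 + 1.40942 * x0                # point prediction
lo, hi = y_hat - 2.571 * se_pred, y_hat + 2.571 * se_pred
print(round(se_pred, 2), round(lo, 2), round(hi, 2))  # 1.46 17.2 24.69
```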


If marketing expenditures rise to 8,000 Pounds, then with 95% confidence the predicted sales will lie between approximately 17,210 and 24,690 Pounds.

Exercise

Please consider a company that believes that its monthly sales may depend on the

number of advertising leaflets distributed each month:

Sales (£000) Leaflets (000s)

30    10
26    9
21    9
30    11
30    10
26    9
20    7
20    6
22    6
15    5

1) Find the equation of the least-squares regression line, assuming that Sales is the dependent variable (y) and Leaflets is the independent variable (x).

2) Find the correlation coefficient between Sales and Leaflets and comment on the result.

3) Predict the effect on monthly sales when the number of leaflets distributed rises

from 12,000 to 15,000. How accurate do you think your prediction is likely to be?


Generalized method of moments

The method of moments is based on estimation of population parameters from sample parameters according to the following assumptions.

Population assumption    Sample assumption
E(u) = 0                 ∑ûᵢ = 0
Cov(x, u) = 0            ∑xᵢûᵢ = 0

The estimated error term, or residual, is ε̂ᵢ, and the estimated parameters are α̂ and β̂.

The regression equation is defined as follows:

yᵢ = α + βxᵢ + εᵢ

The error equation is as follows:

ε̂ᵢ = yᵢ − α̂ − β̂xᵢ   (1)

The sample assumption equations could be re-written as follows based on the estimated error equation (1).

∑ε̂ᵢ = 0

∑(yᵢ − α̂ − β̂xᵢ) = 0, and replacing ∑α̂ = nα̂:

∑yᵢ = nα̂ + β̂∑xᵢ

∑xᵢε̂ᵢ = 0

∑xᵢ(yᵢ − α̂ − β̂xᵢ) = 0

∑xᵢyᵢ = α̂∑xᵢ + β̂∑xᵢ²

Let's consider an example to understand how the method of moments estimates are calculated. The same results will be shown in EViews 6. I will explain later the steps for using EViews 6 to calculate the generalized method of moments.


Let’s take as an example a financial model related to the returns of a share price in relation to market returns.

Number of observations    Share returns    Market returns
1     3.526787    8.73209
2    -4.34533    -5.19815
3     5.222709    6.21865
4    -4.99619    -5.5393
5    -3.04336     7.69808
6    -2.375422   -4.99735
7     2.651303    5.42777
8    -0.68924    -1.5424
9     0.205664    1.4639
10    2.4783      3.6528
11    0.237407   -0.1494
12    0.329728    0.16688
13   -0.26869    -0.1444
14    0.064769    0.097873
15   -0.5873     -0.09911
16    0.329225   -0.08344
17   -0.11849     0.122767
18    0.011541   -0.45767
19   -0.18757    -0.53046
20   -0.38752    -0.11118
21   -0.26835    -0.28947
22    0.262798   -0.17676
23    0.355054   -1.15686
24   -1.34302    -0.5771
25   -0.77964     0.578182
26   -0.04649    -0.05331
27    0.098381   -0.23054
28   -0.09585    -0.66625
29   -0.0059     -0.50071
30   -0.05415    -0.53128

The dependent variable is share price returns expressed in percentage and the independent variable is market price returns expressed in percentage. The next step is to calculate the summations based on the following equations:

∑yᵢ = nα̂ + β̂∑xᵢ   (2)

∑xᵢyᵢ = α̂∑xᵢ + β̂∑xᵢ²   (3)

By solving equations (2) and (3), we will calculate the parameters α̂ and β̂.


Firstly, enter your data in Excel and use the sum function Σ. Please double-check the calculations against the ones that I have done, to enhance your learning and increase your motivation to learn more. The calculations are based on the above two equations. The last row is the total for each column; note that the error terms from this regression sum to zero.

Share returns (y)  Market returns (x)  Market squared  Share * market  Error term
 3.526787    8.73209   76.24939577   30.79622149    0.119395394
-4.34533    -5.19815   27.02076342   22.58767714   -1.863797548
 5.222709    6.21865   38.67160782   32.47819932    2.877858835
-4.99619    -5.5393    30.68384449   27.67539527   -2.370438465
-3.04336     7.69808   59.26043569  -23.42802875   -6.01363018
-2.375422   -4.99735   24.97350702   11.87081513    0.021223476
 2.651303    5.42777   29.46068717   14.39066288    0.640792535
-0.68924    -1.5424     2.37899776    1.063083776   0.24684642
 0.205664    1.4639     2.14300321    0.30107153   -0.129144582
 2.4783      3.6528    13.34294784    9.05273424    1.218147284
 0.237407   -0.1494     0.02232036   -0.035468606   0.584611161
 0.329728    0.16688    0.027848934   0.055025009   0.543226718
-0.26869    -0.1444     0.02085136    0.038798836   0.076400441
 0.064769    0.097873   0.009579124   0.006339136   0.307440007
-0.5873     -0.09911    0.009822792   0.058207303  -0.26135563
 0.329225   -0.08344    0.006962234  -0.027470534   0.648544973
-0.11849     0.122767   0.015071736  -0.014546662   0.11365722
 0.011541   -0.45767    0.209461829  -0.005281969   0.489064424
-0.18757    -0.53046    0.281387812   0.099498382   0.320724953
-0.38752    -0.11118    0.012360992   0.043084474  -0.056473111
-0.26835    -0.28947    0.083792881   0.077679275   0.138067899
 0.262798   -0.17676    0.031244098  -0.046452174   0.621568434
 0.355054   -1.15686    1.33832506   -0.41074777    1.128155735
-1.34302    -0.5771     0.33304441    0.775056842  -0.815008271
-0.77964     0.578182   0.334294425  -0.450773814  -0.740016695
-0.04649    -0.05331    0.002841956   0.002478382   0.260092699
 0.098381   -0.23054    0.053148692  -0.022680756   0.479886601
-0.09585    -0.66625    0.443889063   0.063860063   0.469849348
-0.0059     -0.50071    0.250710504   0.002954189   0.489818322
-0.05415    -0.53128    0.282258438   0.028768812   0.454491603
Total   -3.82   11.12   307.95   127.03   0

The results that we got are as follows:


n = 30, ∑yᵢ = −3.82, ∑xᵢ = 11.12, ∑xᵢ² = 307.95, ∑xᵢyᵢ = 127.03

By plugging these numbers into equations (2) and (3), we get the following results:

∑yᵢ = nα̂ + β̂∑xᵢ   (2)
∑xᵢyᵢ = α̂∑xᵢ + β̂∑xᵢ²   (3)

−3.82 = 30α̂ + 11.12β̂   (4)
127.03 = 11.12α̂ + 307.95β̂   (5)

We first solve equation (4) for the parameter α̂:

α̂ = (−3.82 − 11.12β̂)/30   (6)

We then substitute the value of α̂ into equation (5):

127.03 = −1.4160 − 4.1218β̂ + 307.95β̂
127.03 = −1.4160 + 303.8282β̂
β̂ = 0.42 (to 2 d.p.)

We then substitute the numerical value β̂ = 0.4228 into equation (6):

α̂ = (−3.82 − 11.12 × 0.4228)/30 = (−3.82 − 4.7015)/30 = −0.28
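The two moment equations can also be solved directly in a few lines of Python (a sketch using the column totals computed above):

```python
# sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2,
# solved by substitution for a and b.
n = 30
sum_y, sum_x = -3.82, 11.12
sum_x2, sum_xy = 307.95, 127.03
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = (sum_y - b * sum_x) / n
print(round(a, 2), round(b, 2))  # -0.28 0.42
```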

The generalized method of moments equation and the error term equation are as follows:

yᵢ = −0.28 + 0.42xᵢ + ε̂ᵢ

ε̂ᵢ = yᵢ + 0.28 − 0.42xᵢ

The error term equation gives the within-sample prediction errors. The sum of squares of the residuals will be as follows:


Error term     Error term squared
 0.119395394   0.01425526
-1.863797548   3.473741301
 2.877858835   8.282071477
-2.370438465   5.618978517
-6.01363018   36.16374794
 0.021223476   0.000450436
 0.640792535   0.410615073
 0.24684642    0.060933155
-0.129144582   0.016678323
 1.218147284   1.483882807
 0.584611161   0.341770209
 0.543226718   0.295095267
 0.076400441   0.005837027
 0.307440007   0.094519358
-0.26135563    0.068306766
 0.648544973   0.420610581
 0.11365722    0.012917964
 0.489064424   0.239184011
 0.320724953   0.102864496
-0.056473111   0.003189212
 0.138067899   0.019062745
 0.621568434   0.386347318
 1.128155735   1.272735363
-0.815008271   0.664238482
-0.740016695   0.547624709
 0.260092699   0.067648212
 0.479886601   0.23029115
 0.469849348   0.22075841
 0.489818322   0.239921988
 0.454491603   0.206562617
Total: 0      60.96484018

The interpretation of the generalized method of moments equation is as follows:

When x = 0, the intercept gives the value of the estimated dependent variable. In other words, the share price returns are negative and equal to −0.28%. The slope coefficient is 0.42: when xᵢ changes by a certain amount, the dependent variable changes by Δy = 0.42Δx. Since both series are already expressed in percentage terms, if the market returns increase by one percentage point (Δx = 1), then the share price returns change by 0.42 × 1 = 0.42 percentage points.


Before loading and transferring your data in EViews 6, you have to insert the numerical values or the data in Excel. Do not use long titles for each time – series. Use abbreviations. For example, share for the first time – series and market for the second time – series. Name the sheet of the Excel file and delete the other sheets. For example, the name of the sheet is gm. Once the Excel file is ready, you close it and you open the statistical package EViews 6. You press file and then you select new worksheet. Then, in workfile structure type, select unstructured / undated. Insert the number of observations. In our case, the number is 30. You will get an untitled worksheet. Then press file, then, import, then, read text - lotus – excel. Select the Excel file and press OK. The Excel spreadsheet import screen will open. In the box, upper left data cell, select A2. This is the cell where your first observation starts. In Excel 5 + sheet name, write gm. This is the name of your sheet. In the box name of series, please write share, then press the space bar and then write market. These are the names of the variables.

Once you have done these steps, you will be able to see the time series transferred from Excel to EViews 6. Press file, then, save as. Then, write the filename gm and save it with the extension wf.

To get the output of the generalized method of moments, press quick from the menu at the top of the EViews 6 screen, then estimate equation, then press the option of the generalized method of moments. In equation specification, please write the dependent variable followed by regressors. Thus, you write share c market. In the instrument list, please write market. Then, press OK and you will get the following output. Check the bold numbers with the ones that I have calculated with the previous two equations.

The output of the generalized method of moments in EViews 6 is as follows:

Dependent Variable: SHARE
Method: Generalized Method of Moments
Date: 06/02/15  Time: 11:48
Sample: 1 30
Included observations: 30
Kernel: Bartlett, Bandwidth: Fixed (3), No prewhitening
Simultaneous weighting matrix & coefficient iteration
Convergence achieved after: 1 weight matrix, 2 total coef iterations
Instrument list: MARKET

Variable Coefficient Std. Error t-Statistic Prob.

C        -0.284046  0.230548  -1.232050  0.2282
MARKET    0.422744  0.121375   3.482960  0.0016

R-squared           0.471080    Mean dependent var  -0.127295
Adjusted R-squared  0.452191    S.D. dependent var   1.993636
S.E. of regression  1.475573    Sum squared resid    60.96484
Durbin-Watson stat  1.868822    J-statistic          6.84E-49


The generalized method of moments equation is as follows:

yᵢ = −0.28 + 0.42xᵢ + ε̂ᵢ

The method of maximum likelihood (ML)

Consider the following regression equation:

yᵢ = α + βxᵢ + εᵢ   (1)

The error term is normally distributed and uncorrelated with the independent variables, with zero mean and constant variance. The mathematical formula of the error term is as follows:

εᵢ = yᵢ − α − βxᵢ   (2)

y i is independently and normally distributed in relation to the parameters that constitute the regression equation. The following equation is the density function of a normally distributed yi.

f(yᵢ) = (1/(σ√(2π))) exp[−(1/(2σ²))(yᵢ − α − βxᵢ)²]   (3)

The joint density or probability of the y observations can be expressed as the product of n terms for different dependent variables. Therefore, we have the following function:

f(y₁, y₂, y₃, …, yₙ) = ∏ᵢ₌₁ⁿ (1/(2πσ²))^(1/2) exp[−(1/(2σ²))(yᵢ − α − βxᵢ)²]   (4)

The function of the parameters (α, β, σ²) is called the likelihood function. The aim is to maximize the logarithm of the likelihood function, L.

log L = ∑ᵢ₌₁ⁿ [−(1/2) log(2πσ²) − (1/(2σ²))(yᵢ − α − βxᵢ)²]   (5)

The maximum likelihood (ML) method is based on choosing the values of the unknown parameters that maximize the likelihood function L. The maximum is found by taking the logarithm of the likelihood function. Therefore, we have the following mathematical expression:

−(n/2) log σ² − (n/2) log(2π) − (1/2) ∑(yᵢ − α − βxᵢ)²/σ²   (6)


The last term of equation (6) should be minimized in order to maximize the whole expression. The aim is to maximize the logarithm of the likelihood function with respect to α and β and with respect to σ². We should focus on the variance in relation to the intercept and the slope coefficient.

We express the likelihood function in relation to σ from equation (6).

log L(σ) = −(n/2) log σ² − (n/2) log(2π) − ∑(yᵢ − α − βxᵢ)²/(2σ²)   (7)

We differentiate equation (7) with respect to σ and equate it to zero. The second term, −(n/2) log(2π), is constant and is ignored.

−n/σ + ∑(yᵢ − α − βxᵢ)²/σ³ = 0

The maximum likelihood estimator of σ² is as follows:

σ̂²_ML = ∑(yᵢ − α̂ − β̂xᵢ)²/n

or

σ̂²_ML = RSS/n = ∑ε̂ᵢ²/n

The difference from the ordinary least squares (OLS) estimator is that the calculated variance is not adjusted for the degrees of freedom. Thus, in small samples the maximum likelihood (ML) estimator is biased. In EViews, the log likelihood is calculated easily. The software will find the coefficient or parameter values, namely α̂ and β̂, that maximize the likelihood function. It displays the coefficient values together with the standard errors, the z-statistic and the probability. At the bottom of the output table it shows the log likelihood value in relation to the LR statistic and the related probability. You have to create your own series with a set of statements; the software then evaluates them during the maximization procedure. Please read my article, application of a log likelihood object of an ARMA model of well-known closed-end funds of major investment banks. You will find it in the Word document application of Econometrics in Finance. You will find attached a copy of the article after the brief example that I will use.
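The small-sample bias mentioned above can be sketched numerically (my own illustration, reusing the RSS from the simple-regression example earlier in the chapter):

```python
# ML variance RSS/n versus the degrees-of-freedom-adjusted
# OLS variance RSS/(n-2), with n = 7.
rss, n = 7.105072, 7
sigma2_ml = rss / n          # maximum likelihood estimator (biased downward)
sigma2_ols = rss / (n - 2)   # unbiased OLS estimator
print(round(sigma2_ml, 3), round(sigma2_ols, 3))  # 1.015 1.421
```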

Series of statements that will be evaluated during the maximization procedure

@logl logl1


@byeqn

res = lnmpvki - c(1) - c(2) - c(3) - c(4) - c(5)

var = c(4) * c(5)

logl1 = log(@dnorm(res/@sqrt(var))) - log(var) / 2

denom = 1 + exp(res) + exp(var)

@derivstep @all 1.49e-8 1e-10

@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12

grad21 = lnmpvki - exp(res) / denom
grad22 = grad21 * lnmpvki
grad23 = grad21 * lnmpvki
grad24 = grad21 * lnmpvki
grad25 = grad21 * lnmpvki

The first statement describes and creates the residual series, res, and the second statement creates the variance series, var. The series logl1 describes the log likelihood contribution that will be evaluated for each observation. The @derivstep @all statement gives the relative step size, and the second number is the minimum step size. The @param statement gives the starting values of the parameters that will be included in the model. Finally, the grad series show the gradients of the likelihood. To create a likelihood object, select object, then, new object, and type in the command window logl. The likelihood window will open and you enter a list of statements that describe your econometric model. Then, press estimate. You will find two options of optimization algorithms: the Marquardt and the Berndt – Hall – Hall – Hausman. Select one of them. In the derivatives box, select accuracy and press OK. To check that the log likelihood function is correctly maximized, click on view, then, gradients and then summary in the output box.

Please find attached a copy of the article.


Application of a log likelihood object of an ARMA model of well – known closed – end funds of major investment banks.

Preface

In this article, we have applied an autoregressive moving average, ARMA(2,2), model of order AR(1), AR(2), MA(1), MA(2) and SMA(12) to test the natural logarithmic monthly market returns of the closed-end funds of major investment banks such as Van Kampen income trust, Aberdeen Asia Pacific income fund, Credit Suisse asset management income, Scudder high income and Templeton global income. The purpose of this article is to eliminate the seasonality and to compare and select the maximum value of the log likelihood estimation of the different closed-end funds. We have selected the model with the best forecasting ability in terms of the lowest value of the Akaike information criterion, the Schwarz criterion and the Hannan – Quinn criterion. The software that we have used is EViews 6. We have found that the natural logarithmic monthly market price returns of all closed-end funds are stationary series. We have concluded that the best-fit model is the Scudder high income, (LNMPSHI), closed-end fund, as it has the maximum log likelihood value of -123.16 and the Akaike information criterion, the Schwarz criterion, and the Hannan – Quinn


criterion are 1.65, 1.55 and 1.61 respectively. In terms of gradients at the estimated parameters, we have found that there are outlier values and significant fluctuations at the various observations of the coefficient vectors, C(1), C(2), C(3), C(4) and C(5), of the gradients. The gradients of the log likelihood function have been computed at the estimated parameters, and the unsuccessful estimations are displayed through the outliers at the peak of the graph of the four closed-end funds. The Credit Suisse asset management income, (LNMPCS), closed-end fund displayed fewer outlier values and no significant fluctuations at the various observations of the coefficient vectors, C(1), C(2), C(3), C(4) and C(5), of the gradients. Finally, the analytic derivatives were calculated based on the specified values. The real and minimum step sizes are identical for all coefficient vectors and very close to zero. We have used one-sided numeric derivatives. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market returns observations are 143. The data was obtained from the Thomson Financial Investment View database.

Keywords: Autoregressive moving average, (ARMA) model, log likelihood function, (Logl), gradients estimation, analytic derivatives, logarithmic market price returns.

Introduction

The purpose of this article, which applies an autoregressive moving average, (ARMA), model, is twofold. Firstly, we aim to eliminate the seasonal patterns of the time series of closed-end funds by applying a monthly seasonal moving average, SMA(12), term. Because the autocorrelations did not decline geometrically, we have used first- and second-order autoregressive and moving average terms. Secondly, we aim to compare the log likelihood object of each closed-end fund through the estimates of an iterative algorithm. We have estimated the regression equations, and the coefficient vectors that include the regression estimates, C(1), C(2), C(3), C(4) and C(5), were then used to calculate the log likelihood. The time series that we have used are the monthly natural logarithmic market price returns of the Van Kampen income trust, Aberdeen Asia Pacific income fund, Credit Suisse asset management income, Scudder high income and Templeton global income.

Evaluation of the forecasting performance of the various types of ARMA models and their log likelihood estimation will be based on the information criteria. Specifically, the Akaike information criterion, (AIC), the Schwarz information criterion, (SIC), and the Hannan – Quinn criterion, (HQC), will be used to choose the best model, that is, the one that minimizes the error term.

The rest of the paper is organized as follows. Section 1 describes the methodological issues and data explanations. Section 2 shows the results of statistical and econometrical tests and Section 3 summarizes and concludes.

1. Methodological issues and data explanations.

In this article, we are going to test the natural logarithmic monthly market price returns to determine the monthly seasonal moving average, SMA(12), term and the autoregressive, (AR), and moving average, (MA), terms. Then, we aim to compare the log likelihood object of each closed-end fund through the estimates of an iterative algorithm and conclude, based on the values of the log likelihood, which model is the best fit. The methodology was based on the application of autoregressive moving average, (ARMA), models. Such models have been developed by different academics, such as: Andrews, (1991), Bhargava, (1986), Box and Gwilym, (1976), Davidson and James, (1993), Dickey and Fuller, (1979), Elliott, Thomas and James, (1996), Fischer, (1932), Greene, (1997), Hamilton, (1994a), Hayashi, (2000), Johnston and John, (1997), Kwiatkowski, Denis, Peter and Yongcheol, (1992), Maddala and Wu, (1999), Newey, Whitney and West, (1994), Serena and Perron, (2001), Phillips and Perron, (1988), Rao and Griliches, (1969), Said and David, (1984), and Vuong, (1989).

In this article, we are going to apply an ARMA model of order AR(p) and MA(q). Specifically, the order of the model is AR(1), AR(2) and MA(1), MA(2). The aim is to eliminate seasonality and estimate the log likelihood values of the Van Kampen income trust, Aberdeen Asia Pacific income fund, Credit Suisse asset management income, Scudder high income and Templeton global income closed-end funds.


The mathematical equation of the ARMA model is as follows:

y_t = α + Σ_{i=1}^{p} ρ_i·y_{t−i} + ε_t + Σ_{i=1}^{q} ω_i·ε_{t−i}    (1)

Equation (1) is derived from the addition of equation (2) and equation (3).

The mathematical equation of an AR(p) model is as follows:

y_t = α + ρ_1·y_{t−1} + ρ_2·y_{t−2} + ... + ρ_p·y_{t−p} + ε_t,  or

y_t = α + Σ_{i=1}^{p} ρ_i·y_{t−i} + ε_t    (2)

Where: y_t is the dependent variable, in our case the logarithmic monthly returns of the market prices of the closed-end funds.

y_{t−i} is the lagged dependent variable of order i.

α is a constant.

ρ_i are coefficients.

ε_t is the error term.
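As a concrete illustration of equation (2), the short Python sketch below (illustrative only; the article itself uses EViews 6) simulates an AR(2) process. The values α = 0.1, ρ_1 = 0.6 and ρ_2 = −0.2 are assumptions chosen inside the stationarity region, not estimates from the funds:

```python
import random

random.seed(7)

# Hypothetical AR(2) coefficients (assumed for illustration)
a, rho1, rho2 = 0.1, 0.6, -0.2

# Simulate y_t = a + rho1*y_{t-1} + rho2*y_{t-2} + e_t, as in equation (2)
y = [0.0, 0.0]
for _ in range(200):
    y.append(a + rho1 * y[-1] + rho2 * y[-2] + random.gauss(0, 1))
y = y[2:]  # drop the two start-up values

# For a stationary AR(2), the long-run mean is a / (1 - rho1 - rho2)
long_run_mean = a / (1 - rho1 - rho2)
```

The sample mean of the simulated series settles near the long-run mean, which is the stationarity property the ADF tests of Section 2 examine on the actual return series.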

The mathematical equation of an MA(q) model is as follows:

y_t = ε_t + ω_1·ε_{t−1} + ω_2·ε_{t−2} + ... + ω_q·ε_{t−q},  or

y_t = ε_t + Σ_{i=1}^{q} ω_i·ε_{t−i}    (3)

Where: y_t is the logarithmic monthly returns of the market prices of the closed-end funds.

ω_i are coefficients.

ε_t is the error term.

ε_{t−i} is the lagged error term of order i.


According to EViews User’s Guide II, (2007, pp.71-72), the serial correlation of the residuals of the ARMA model is the sum of the AR(p), I(d) and MA(q) terms. We will illustrate first the residuals of the autoregressive model of order AR(p).

υ_t = ρ_1·υ_{t−1} + ρ_2·υ_{t−2} + ... + ρ_p·υ_{t−p}    (4)

Where: υ_t is the residual term.

ρ_i are the coefficients.

υ_{t−i} are the lagged residuals of order i.

The residuals of the moving average model of order MA(q) are as follows:

υ_t = ε_t + θ_1·ε_{t−1} + θ_2·ε_{t−2} + ... + θ_q·ε_{t−q}    (5)

Where: υ_t is the residual term.

θ_i are the coefficients.

ε_{t−i} is the lagged error term of order i.

The residuals of the ARMA model are the sum of equations (4) and (5):

υ_t = ρ_1·υ_{t−1} + ρ_2·υ_{t−2} + ... + ρ_p·υ_{t−p} + ε_t + θ_1·ε_{t−1} + θ_2·ε_{t−2} + ... + θ_q·ε_{t−q}    (6)
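The recursions in equations (1) to (6) are straightforward to state in code. As an illustration only (the article itself uses EViews 6), the following Python sketch generates an ARMA(2,2) series of the same length as the return series used in this article, 143 observations; all coefficient values here are hypothetical and are not the estimated C(1) to C(5):

```python
import random

random.seed(42)

# Hypothetical ARMA(2,2) coefficients (illustrative, not estimated values)
a, rho1, rho2 = 0.0, 0.5, -0.2
w1, w2 = 0.3, 0.1

n = 143
e = [random.gauss(0, 1) for _ in range(n + 2)]

# y_t = a + rho1*y_{t-1} + rho2*y_{t-2} + e_t + w1*e_{t-1} + w2*e_{t-2},
# the additive AR plus MA structure of equation (1)
y = [0.0, 0.0]
for t in range(2, n + 2):
    y.append(a + rho1 * y[-1] + rho2 * y[-2] + e[t] + w1 * e[t - 1] + w2 * e[t - 2])
y = y[2:]  # drop the two start-up values
```

Because the hypothetical coefficients lie inside the stationarity region, the simulated series fluctuates around zero rather than drifting, which is the behaviour the ADF tests in Section 2 look for.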

According to EViews User’s Guide II, (2007, p.284), equation (7) is based on the conditional heteroskedasticity regression equation y_t = β_1 + β_2·x_t + β_3·z_t + ε_t. The log likelihood function for a sample of T observations is as follows:

l(β, α, σ) = −(T/2)·(log(2π) + log σ²) − (α/2)·Σ_{t=1}^{T} log z_t − Σ_{t=1}^{T} (y_t − β_1 − β_2·x_t − β_3·z_t)² / (2σ²·z_t^α)

           = Σ_{t=1}^{T} { log φ((y_t − β_1 − β_2·x_t − β_3·z_t) / (σ·z_t^{α/2})) − (1/2)·log(σ²·z_t^α) }    (7)

Where:


φ is the standard normal density function.

x_t, y_t and z_t are data series.

β_1, β_2, β_3, σ and α are the parameters of the model.
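The two forms of equation (7) are algebraically identical, and this can be checked numerically. The Python sketch below (illustrative only; all parameter values and the generated data are assumptions, not estimates from the funds) evaluates both sides on hypothetical data:

```python
import math
import random

random.seed(1)

# Hypothetical data and parameters for y_t = b1 + b2*x_t + b3*z_t + e_t
T = 50
b1, b2, b3, sigma, alpha = 0.5, 1.2, -0.3, 0.8, 0.6
x = [random.random() for _ in range(T)]
z = [1.0 + random.random() for _ in range(T)]  # z_t > 0 so z_t**alpha is defined
y = [b1 + b2 * x[t] + b3 * z[t] + random.gauss(0, 1) for t in range(T)]

def log_phi(u):
    # log of the standard normal density function phi
    return -0.5 * math.log(2 * math.pi) - 0.5 * u * u

e = [y[t] - b1 - b2 * x[t] - b3 * z[t] for t in range(T)]

# First form of equation (7)
lhs = (-T / 2 * (math.log(2 * math.pi) + math.log(sigma ** 2))
       - alpha / 2 * sum(math.log(zt) for zt in z)
       - sum(e[t] ** 2 / (2 * sigma ** 2 * z[t] ** alpha) for t in range(T)))

# Second form: sum over t of log phi(...) - (1/2) log(sigma^2 z_t^alpha)
rhs = sum(log_phi(e[t] / (sigma * z[t] ** (alpha / 2)))
          - 0.5 * math.log(sigma ** 2 * z[t] ** alpha) for t in range(T))
```

Expanding log φ(·) in the second form reproduces the −(T/2)·log(2π) term, the variance terms and the weighted sum of squared residuals of the first form, so the two expressions agree to floating-point precision.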

We will modify equation (7) and construct equation (8) to fit the ARMA models of equations (9), (10), (11), (12) and (13). Thus, the modified equation is as follows:

l(β, α, σ) = Σ_{t=1}^{T} { log φ((y_t − β_1·AR(1) − β_2·AR(2) − β_3·MA(1) − β_4·MA(2) − β_5·SMA(12)) / (σ·SMA(12)_t^{α/2})) − (1/2)·log(σ²·SMA(12)_t^α) }    (8)

Where:

y_t is the natural logarithmic monthly market price returns of the Van Kampen income trust, (LNMPVKI), Aberdeen Asia Pacific income fund, (LNMPAAP), Credit Suisse asset management income, (LNMPCS), Scudder high income, (LNMPSHI) and Templeton global income, (LNMPTGI) closed-end funds. β_1, β_2, β_3, β_4, β_5 are the coefficients of the autoregressive, moving average and seasonal moving average terms AR(1), AR(2), MA(1), MA(2) and SMA(12).

The following equations show specification examples of the text screen of the log likelihood and ARMA(2,2), after including the seasonal term, SMA(12), in EViews 6. The likelihood specification includes three statements. The first is a residual equation or series, (RES). The second is a variance equation or series, (VAR), and the third is the logl1 specification of the model of each closed-end fund. In addition, we have specified the derivstep in terms of @all, the relative step size and the minimum step size. The @param statement is used to specify the values that will be used in the coefficient vectors C. Finally, the gradients of the likelihood were specified for the five coefficient vectors. The keyword @byeqn is used to express evaluation by equation and not by observation. The specifications for the natural logarithmic monthly market price returns of the Van Kampen income trust, (LNMPVKI), Aberdeen Asia Pacific income fund, (LNMPAAP), Credit Suisse asset management income, (LNMPCS), Scudder high income, (LNMPSHI) and Templeton global income, (LNMPTGI) closed-end funds are as follows:

LNMPVKI = AR(1) + AR(2) + MA(1) + MA(2) + SMA (12) (9)

@logl logl1
@byeqn
res = lnmpvki - c(1) - c(2) - c(3) - c(4) - c(5)
var = c(4) * c(5)
logl1 = log(@dnorm(res/@sqrt(var))) - log(var)/2
denom = 1 + exp(res) + exp(var)
@derivstep @all 1.49e-8 1e-10
@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12
grad21 = lnmpvki - exp(res) / denom
grad22 = grad21 * lnmpvki
grad23 = grad21 * lnmpvki
grad24 = grad21 * lnmpvki
grad25 = grad21 * lnmpvki

LNMPAAP = AR(1) + AR(2) + MA(1) + MA(2) + SMA (12) (10)

@logl logl1
@byeqn
res = lnmpaap - c(1) - c(2) - c(3) - c(4) - c(5)
var = c(4) * c(5)
logl1 = log(@dnorm(res/@sqrt(var))) - log(var)/2
denom = 1 + exp(res) + exp(var)
@derivstep @all 1.49e-8 1e-10
@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12
grad21 = lnmpaap - exp(res) / denom
grad22 = grad21 * lnmpaap
grad23 = grad21 * lnmpaap
grad24 = grad21 * lnmpaap
grad25 = grad21 * lnmpaap

LNMPCS = AR(1) + AR(2) + MA(1) + MA(2) + SMA (12) (11)

@logl logl1
@byeqn
res = lnmpcs - c(1) - c(2) - c(3) - c(4) - c(5)
var = c(4) * c(5)
logl1 = log(@dnorm(res/@sqrt(var))) - log(var)/2
denom = 1 + exp(res) + exp(var)
@derivstep @all 1.49e-8 1e-10
@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12
grad21 = lnmpcs - exp(res) / denom
grad22 = grad21 * lnmpcs
grad23 = grad21 * lnmpcs
grad24 = grad21 * lnmpcs
grad25 = grad21 * lnmpcs

LNMPSHI = AR(1) + AR(2) + MA(1) + MA(2) + SMA (12) (12)

@logl logl1
@byeqn
res = lnmpshi - c(1) - c(2) - c(3) - c(4) - c(5)
var = c(4) * c(5)
logl1 = log(@dnorm(res/@sqrt(var))) - log(var)/2
denom = 1 + exp(res) + exp(var)
@derivstep @all 1.49e-8 1e-10
@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12
grad21 = lnmpshi - exp(res) / denom
grad22 = grad21 * lnmpshi
grad23 = grad21 * lnmpshi
grad24 = grad21 * lnmpshi
grad25 = grad21 * lnmpshi

LNMPTGI = AR(1) + AR(2)+ MA(1) + MA(2) + SMA (12) (13)

@logl logl1
@byeqn
res = lnmptgi - c(1) - c(2) - c(3) - c(4) - c(5)
var = c(4) * c(5)
logl1 = log(@dnorm(res/@sqrt(var))) - log(var)/2
denom = 1 + exp(res) + exp(var)
@derivstep @all 1.49e-8 1e-10
@param c(1) 1 c(2) 2 c(3) 1 c(4) 2 c(5) 12
grad21 = lnmptgi - exp(res) / denom
grad22 = grad21 * lnmptgi
grad23 = grad21 * lnmptgi
grad24 = grad21 * lnmptgi
grad25 = grad21 * lnmptgi

Where: AR are autoregressive terms.

MA are moving average terms.

SMA is the seasonal moving average term.

According to EViews User’s Guide II, (2007, p.289), we have used a one-sided numeric derivative, which is expressed as follows:

(f(θ_(i) + s_(i+1)) − f(θ_(i))) / s_(i+1)    (14)

Where:

θ_(i) denotes the value of the parameter θ at iteration i.

The step size at iteration i+1 is s_(i+1) = max(r·θ_(i), m).

f is the likelihood function.
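A minimal Python sketch of the one-sided numeric derivative of equation (14), using the same relative and minimum step sizes quoted later in the text (1.49e-8 and 1e-10); the test function exp and the abs() guard on θ are illustrative assumptions:

```python
import math

def one_sided_derivative(f, theta, r=1.49e-8, m=1e-10):
    # Step size s = max(r*theta, m), as defined under equation (14);
    # abs() guards against negative parameter values (an assumption here).
    s = max(r * abs(theta), m)
    # One-sided (forward) difference: (f(theta + s) - f(theta)) / s
    return (f(theta + s) - f(theta)) / s

# Forward difference of exp at 1.0 should approximate e = 2.71828...
approx = one_sided_derivative(math.exp, 1.0)
```

The relative step size near 1.49e-8 (roughly the square root of machine epsilon) balances truncation error against rounding error, which is why EViews uses it as a default.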

According to EViews User’s Guide II, (2007, p.645), the mathematical equations of the Akaike information criterion, (AIC), the Schwarz information criterion, (SIC), and the Hannan – Quinn criterion, (HQC), are as follows:

Akaike information criterion, (AIC) = −2(l/T) + 2(k/T)    (15)

Schwarz criterion, (SC) = −2(l/T) + k·log(T)/T    (16)

Hannan – Quinn criterion, (HQC) = −2(l/T) + 2k·log(log(T))/T    (17)

Where l is the value of the log likelihood function with k parameters estimated using T observations.
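The three criteria of equations (15) to (17) can be written directly as functions of l, k and T. In the sketch below the value l = -100 is hypothetical; for a sample of this size the Schwarz criterion carries the heaviest penalty per parameter and the AIC the lightest:

```python
import math

def aic(l, k, T):
    # Akaike information criterion, equation (15)
    return -2 * (l / T) + 2 * (k / T)

def sic(l, k, T):
    # Schwarz criterion, equation (16)
    return -2 * (l / T) + k * math.log(T) / T

def hqc(l, k, T):
    # Hannan-Quinn criterion, equation (17)
    return -2 * (l / T) + 2 * k * math.log(math.log(T)) / T

# Hypothetical log likelihood l = -100 with k = 5 parameters, T = 143 observations
vals = (aic(-100, 5, 143), hqc(-100, 5, 143), sic(-100, 5, 143))
```

With T = 143 and k = 5, the penalty terms are 2k/T ≈ 0.070 (AIC), 2k·log(log T)/T ≈ 0.112 (HQC) and k·log(T)/T ≈ 0.174 (SIC), so for a common log likelihood the criteria order as AIC < HQC < SIC.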


The discount / premium on a closed-end fund is calculated as the difference between the market price for month t and market price for month t-1 divided by the market price for month t-1.

Discount / Premium = (Market Price_t − Market Price_{t−1}) / Market Price_{t−1}    (18)

The logarithmic formula is as follows:

R_t = ln(P_t / P_{t−1})    (19)

Where: Rt is the monthly return for month t, Pt is the closing price for month t, and Pt-1 is the closing price lagged one period for month t-1.
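Equations (18) and (19) applied to a short hypothetical price series; note that n prices always yield n−1 returns, which is why 144 monthly prices give 143 return observations:

```python
import math

prices = [10.0, 10.5, 10.2, 10.8]  # hypothetical monthly closing prices

# Equation (18): simple percentage change (the discount / premium)
simple = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

# Equation (19): logarithmic monthly return R_t = ln(P_t / P_{t-1})
log_ret = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
```

For small changes the two measures are close (ln(1 + x) ≈ x), but log returns have the convenient property of summing across months to the return over the whole period.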

The data that we have used are monthly, from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from the Thomson Financial Investment View database. We have checked for stationarity of the monthly market price returns by applying an ADF test.

We illustrate, for simplicity, the figure for one closed-end fund. Figure 1 shows the fluctuations of the monthly percentage changes of the market price returns of the Credit Suisse asset management income closed-end fund for the period from 31/01/1990 to 31/12/2001. The changes represent the discount / premium of the fund.


[Figure 1: Discount / Premium of Credit Suisse asset management income. Monthly percentage returns plotted against date, Feb-1990 to Feb-2001.]

Source: Author’s calculation based on Excel software. Data were obtained from Thomson Financial Investment View database.

According to Figure 1, there were large monthly percentage fluctuations of the market prices of the Credit Suisse asset management income fund. Specifically, on 31/03/1990, the percentage change in the market price of the closed-end fund was at a discount of -6.45%. On 31/10/1990, the change in the market price was at a premium of 4.08%. On 30/11/1993, the discount was -9.67%, and on 30/09/1996, the fund was at a discount of -7.29%. On 31/01/2001, the fund recorded its highest premium, 23.64%. The fluctuations of the market prices in terms of premium and discount are measured through the ARMA model to estimate expectations about the market price returns and the maximum value of the log likelihood object.


2. Statistical and econometrical tests.

Table 1 shows the ADF test of the natural logarithmic monthly market price returns of the Van Kampen income trust closed-end fund, (LNMPVKI). The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database.

Null Hypothesis: LNMPVKI has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic     -13.91430    0.0000
Test critical values:   1% level            -3.476805
                        5% level            -2.881830
                       10% level            -2.577668

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LNMPVKI)
Method: Least Squares
Date: 10/26/13   Time: 14:39
Sample (adjusted): 1990M03 2001M12
Included observations: 142 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LNMPVKI(-1)     -1.096340     0.078792     -13.91430     0.0000
C               -0.000594     0.002913     -0.203741     0.8389

R-squared            0.580346   Mean dependent var      0.001013
Adjusted R-squared   0.577348   S.D. dependent var      0.053357
S.E. of regression   0.034688   Akaike info criterion  -3.870853
Sum squared resid    0.168458   Schwarz criterion      -3.829221
Log likelihood       276.8305   Hannan-Quinn criter.   -3.853935
F-statistic          193.6078   Durbin-Watson stat      1.896357
Prob(F-statistic)    0.000000

Source: Author’s calculation based on EViews 6 software.

For a one per cent level of significance, the critical value of the t-statistic from the Dickey-Fuller table is -3.48. According to Table 1 and the sample evidence, we can reject the null hypothesis of a unit root at the one, five and ten per cent significance levels. The ADF test statistic is -13.91, which is smaller than the critical values (-3.48, -2.88, -2.58). In other words, the natural logarithmic monthly market returns of the Van Kampen income closed-end fund are a stationary series.
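The mechanics behind Tables 1 to 5 can be sketched with the basic Dickey-Fuller regression Δy_t = c + γ·y_{t−1} + ε_t, where the test statistic is the t-ratio on γ. The Python sketch below (illustrative only; the AR coefficient 0.5 of the simulated series is an assumption) runs this regression on stationary data, so the statistic should fall well below the critical values quoted in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1) series with hypothetical coefficient 0.5
T = 200
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.normal()

# Dickey-Fuller regression: dy_t = c + gamma * y_{t-1} + e_t
dy, ylag = np.diff(y), y[:-1]
X = np.column_stack([np.ones_like(ylag), ylag])
beta, *_ = np.linalg.lstsq(X, dy, rcond=None)

# t-ratio on gamma = beta[1] is the (A)DF test statistic
resid = dy - X @ beta
s2 = resid @ resid / (len(dy) - X.shape[1])
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
t_gamma = beta[1] / se[1]
```

Under the null of a unit root γ = 0, and the t-ratio follows the Dickey-Fuller rather than the Student-t distribution, which is why the tables compare it with the MacKinnon critical values instead of the usual ±1.96.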

Table 2 shows the ADF test of the natural logarithmic monthly market price returns of the Aberdeen Asia Pacific income fund, (LNMPAAP). The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database.

Null Hypothesis: LNMPAAP has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic     -11.52110    0.0000
Test critical values:   1% level            -3.476805
                        5% level            -2.881830
                       10% level            -2.577668

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LNMPAAP)
Method: Least Squares
Date: 10/26/13   Time: 14:41
Sample (adjusted): 1990M03 2001M12
Included observations: 142 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LNMPAAP(-1)     -0.980552     0.085109     -11.52110     0.0000
C               -0.005844     0.003866     -1.511868     0.1328

R-squared            0.486683   Mean dependent var     -0.000449
Adjusted R-squared   0.483016   S.D. dependent var      0.063592
S.E. of regression   0.045724   Akaike info criterion  -3.318424
Sum squared resid    0.292690   Schwarz criterion      -3.276792
Log likelihood       237.6081   Hannan-Quinn criter.   -3.301506
F-statistic          132.7358   Durbin-Watson stat      1.984939
Prob(F-statistic)    0.000000

Source: Author’s calculation based on EViews 6 software.

For a one per cent level of significance, the critical value of the t-statistic from the Dickey-Fuller table is -3.48. According to Table 2 and the sample evidence, we can reject the null hypothesis of a unit root at the one, five and ten per cent significance levels. The ADF test statistic is -11.52, which is smaller than the critical values (-3.48, -2.88, -2.58). In other words, the natural logarithmic monthly returns of the Aberdeen Asia Pacific income closed-end fund are a stationary series.

Table 3 shows the ADF test of the natural logarithmic monthly market price returns of the Credit Suisse asset management income closed-end fund, (LNMPCS). The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database.

Null Hypothesis: LNMPCS has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic     -11.63889    0.0000
Test critical values:   1% level            -3.476805
                        5% level            -2.881830
                       10% level            -2.577668

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LNMPCS)
Method: Least Squares
Date: 10/26/13   Time: 14:42
Sample (adjusted): 1990M03 2001M12
Included observations: 142 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LNMPCS(-1)      -1.019636     0.087606     -11.63889     0.0000
C               -0.003161     0.003619     -0.873409     0.3839

R-squared            0.491766   Mean dependent var     -0.000764
Adjusted R-squared   0.488136   S.D. dependent var      0.060176
S.E. of regression   0.043053   Akaike info criterion  -3.438791
Sum squared resid    0.259497   Schwarz criterion      -3.397160
Log likelihood       246.1542   Hannan-Quinn criter.   -3.421874
F-statistic          135.4637   Durbin-Watson stat      1.933443
Prob(F-statistic)    0.000000

Source: Author’s calculation based on EViews 6 software.

For a one per cent level of significance, the critical value of the t-statistic from the Dickey-Fuller table is -3.48. According to Table 3 and the sample evidence, we can reject the null hypothesis of a unit root at the one, five and ten per cent significance levels. The ADF test statistic is -11.64, which is smaller than the critical values (-3.48, -2.88, -2.58). In other words, the natural logarithmic monthly returns of the Credit Suisse asset management income closed-end fund are a stationary series.

Table 4 shows the ADF test of the natural logarithmic monthly market price returns of the Scudder high income, (LNMPSHI) closed-end fund. The total observations are 144 and the logarithmic monthly market prices returns observations are 143. The data was obtained from Thomson Financial Investment View database.


Null Hypothesis: LNMPSHI has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic     -11.70455    0.0000
Test critical values:   1% level            -3.476805
                        5% level            -2.881830
                       10% level            -2.577668

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LNMPSHI)
Method: Least Squares
Date: 10/26/13   Time: 14:43
Sample (adjusted): 1990M03 2001M12
Included observations: 142 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LNMPSHI(-1)     -0.985690     0.084214     -11.70455     0.0000
C               -0.000954     0.003823     -0.249630     0.8032

R-squared            0.494579   Mean dependent var     -0.000432
Adjusted R-squared   0.490968   S.D. dependent var      0.063854
S.E. of regression   0.045557   Akaike info criterion  -3.325706
Sum squared resid    0.290566   Schwarz criterion      -3.284075
Log likelihood       238.1251   Hannan-Quinn criter.   -3.308789
F-statistic          136.9966   Durbin-Watson stat      1.987101
Prob(F-statistic)    0.000000

Source: Author’s calculation based on EViews 6 software.

For a one per cent level of significance, the critical value of the t-statistic from the Dickey-Fuller table is -3.48. According to Table 4 and the sample evidence, we can reject the null hypothesis of a unit root at the one, five and ten per cent significance levels. The ADF test statistic is -11.70, which is smaller than the critical values (-3.48, -2.88, -2.58). In other words, the natural logarithmic monthly returns of the Scudder high income closed-end fund are a stationary series.

Table 5 shows the ADF test of the natural logarithmic monthly market price returns of the Templeton global income, (LNMPTGI) closed-end fund. The total observations are 144 and the logarithmic monthly market prices returns observations are 143. The data was obtained from Thomson Financial Investment View database.

Null Hypothesis: LNMPTGI has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic     -13.31403    0.0000
Test critical values:   1% level            -3.476805
                        5% level            -2.881830
                       10% level            -2.577668

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(LNMPTGI)
Method: Least Squares
Date: 10/26/13   Time: 14:44
Sample (adjusted): 1990M03 2001M12
Included observations: 142 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LNMPTGI(-1)     -1.114683     0.083722     -13.31403     0.0000
C               -0.002271     0.002730     -0.831818     0.4069

R-squared            0.558726   Mean dependent var      5.74E-05
Adjusted R-squared   0.555574   S.D. dependent var      0.048701
S.E. of regression   0.032467   Akaike info criterion  -4.003206
Sum squared resid    0.147574   Schwarz criterion      -3.961574
Log likelihood       286.2276   Hannan-Quinn criter.   -3.986289
F-statistic          177.2635   Durbin-Watson stat      2.024262
Prob(F-statistic)    0.000000

Source: Author’s calculation based on EViews 6 software.

For a one per cent level of significance, the critical value of the t-statistic from the Dickey-Fuller table is -3.48. According to Table 5 and the sample evidence, we can reject the null hypothesis of a unit root at the one, five and ten per cent significance levels. The ADF test statistic is -13.31, which is smaller than the critical values (-3.48, -2.88, -2.58). In other words, the natural logarithmic monthly returns of the Templeton global income fund are a stationary series.

Table 6 shows the results of the log likelihood estimation of an ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Aberdeen Asia Pacific income fund, (LNMPAAP) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (10).

LogL: LNMPAAP
Method: Maximum Likelihood (Marquardt)
Date: 11/02/13   Time: 18:41
Sample: 1 143
Included observations: 143
Evaluation order: By equation
Initial Values: C(1)=-0.35707, C(2)=0.64293, C(3)=-0.36607, C(4)=0.06173, C(5)=0.0226
Convergence achieved after 1 iteration

          Coefficient   Std. Error   z-Statistic   Prob.
C(1)      -0.357065     0.047864     -7.459990     0.0256
C(2)       0.642932     0.051195     12.558492     0.0300
C(3)      -0.366065     0.040849     -8.961418     0.0000
C(4)       0.061726     0.010296      5.995143     0.0000
C(5)       0.022597     0.089435      0.252663     0.8290

Log likelihood        -215.0806   Akaike info criterion   2.938190
Avg. log likelihood   -1.504060   Schwarz criterion       2.834594
Number of Coefs.       5          Hannan-Quinn criter.    2.896094

Source: Author’s calculation based on EViews 6 software.
Significant p-value at the 5% significance level.

By applying equation (10) to Table 6 we have the following results.

Logl = -0.36 + 0.64 - 0.37 + 0.06 + 0.02
       (-7.46) (12.56) (-8.96) (5.995) (0.25)

According to Table 6, the coefficients of the vectors C(1), C(2), C(3) and C(4) are statistically significant at the 5% significance level, as their displayed p-values are less than 0.05. The vector coefficient C(5) is 0.02 with a p-value of 0.83, so it is not statistically significant. The log likelihood estimation is -215.08. The Akaike information criterion, the Schwarz criterion and the Hannan – Quinn criterion are 2.94, 2.83 and 2.90 respectively.

Graph 1 shows the gradients of the log likelihood estimation of an ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Aberdeen Asia Pacific income fund, (LNMPAAP) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (10).

[Graph 1: Gradients of the log likelihood at the estimated parameters, one panel for each coefficient vector C(1), C(2), C(3), C(4) and C(5).]

Source: Author’s calculation based on EViews 6 software.

According to Graph 1, there are outlier values and significant fluctuations at the various observations of the coefficient vectors, C(1), C(2), C(3), C(4) and C(5), of the gradients. For example, at observation 103, C(1) has a value of -156.24 and the coefficient vector C(5) has a value of 574.98. The previous value of C(5), at observation 102, was -28.76. The gradients of the log likelihood function have been computed at the estimated parameters, and the unsuccessful estimations are displayed through the outlier peaks of the graph.

Table 7 shows the results of the analytic derivatives of the log likelihood estimation of an ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Aberdeen Asia Pacific income fund, (LNMPAAP) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (10).

Log-Likelihood derivative testing
LogL: LOGL01
Two-sided accurate numeric derivatives
Evaluated at estimated parameters

Coefficient   User     Rel. Step   Min. Step   Coef. Value
C(1)          GRAD21   1.49E-08    1.00E-10    -0.357065
C(2)          GRAD22   1.49E-08    1.00E-10     0.642932
C(3)          GRAD23   1.49E-08    1.00E-10    -0.366065
C(4)          GRAD24   1.49E-08    1.00E-10     0.061726
C(5)          GRAD25   1.49E-08    1.00E-10     0.022597

Source: Author’s calculation based on EViews 6 software.

According to Table 7, and because the logl specification contains a @param statement, the analytic derivatives were calculated based on the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. We have used a one-sided numeric derivative.

Table 8 shows the results of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Credit Suisse asset management income, (LNMPCS), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (11).

LogL: LNMPCS
Method: Maximum Likelihood (Marquardt)
Date: 11/02/13   Time: 18:47
Sample: 1 143
Included observations: 143
Evaluation order: By equation
Initial Values: C(1)=-1.1025, C(2)=-0.1025, C(3)=-1.21368, C(4)=1.08013, C(5)=1.17984
Convergence achieved after 4 iterations

          Coefficient   Std. Error   z-Statistic   Prob.
C(1)      -1.102496     0.376880     -2.925323     0.0531
C(2)      -0.102498     0.353221     -0.290180     0.6871
C(3)      -1.213679     0.380799     -3.187190     0.0280
C(4)       1.080125     0.354304      3.048582     0.0323
C(5)       1.179843     0.104115     11.332113     0.0311

Log likelihood        -148.9558   Akaike info criterion   2.153228
Avg. log likelihood   -1.041649   Schwarz criterion       2.256824
Number of Coefs.       5          Hannan-Quinn criter.    2.195324

Source: Author’s calculation based on EViews software.
Significant p-value at the 5% significance level.

By applying equation (11) to Table 8 we have the following results.

Logl = -1.102 - 0.102 - 1.214 + 1.08 + 1.18
       (-2.93) (-0.29) (-3.19) (3.05) (11.33)

According to Table 8, the coefficient of the vector C(2) is not statistically significant, and C(1) is marginally insignificant at the 5% level, with a p-value of 0.0531. The coefficients C(3), C(4) and C(5) are statistically significant at the 5% significance level, as their displayed p-values are smaller than 0.05. The log likelihood estimation is -148.96. The Akaike information criterion, the Schwarz criterion and the Hannan – Quinn criterion are 2.15, 2.26 and 2.20 respectively.

Graph 2 shows the gradients of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Credit Suisse asset management income, (LNMPCS), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (11).

[Graph 2: Gradients of the log likelihood at the estimated parameters, one panel for each coefficient vector C(1), C(2), C(3), C(4) and C(5).]

Source: Author’s calculation based on EViews 6 software.


According to Graph 2, there are fewer outlier values and no significant fluctuations at the various observations of the coefficient vectors, C(1), C(2), C(3), C(4) and C(5), of the gradients. For example, at observation 103, C(1) has a value of -0.07 and the coefficient vector C(5) has a value of -0.49. The previous value of C(5), at observation 102, was -0.39. The gradients of the log likelihood function have been computed at the estimated parameters, and the unsuccessful estimations are displayed through the outliers.

Table 9 shows the results of the analytic derivatives of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Credit Suisse asset management income, (LNMPCS), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (11).

Log-Likelihood derivative testing
LogL: LOGL02
Two-sided accurate numeric derivatives
Evaluated at estimated parameters

Coefficient   User     Rel. Step   Min. Step   Coef. Value
C(1)          GRAD21   1.49E-08    1.00E-10    -1.102496
C(2)          GRAD22   1.49E-08    1.00E-10    -0.102498
C(3)          GRAD23   1.49E-08    1.00E-10    -1.213679
C(4)          GRAD24   1.49E-08    1.00E-10     1.080125
C(5)          GRAD25   1.49E-08    1.00E-10     1.179843

Source: Author’s calculation based on EViews 6 software.

According to Table 9, because the logl specification contains a @param statement, the derivatives were calculated at the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. Two-sided numeric derivatives have been used, as reported in the table header.
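The two-sided (central) difference used in the derivative testing above can be sketched in a few lines. This is a generic illustration, not EViews’ internal routine; the test function and the step size are illustrative:

```python
# Central (two-sided) difference approximation of a derivative, as used
# when checking gradients of a log likelihood numerically. The error is
# of order h**2, versus order h for a one-sided difference.
def two_sided_derivative(f, x, h=1.49e-8):
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Illustrative check on f(x) = x**2, whose exact derivative at x = 3 is 6.
approx = two_sided_derivative(lambda x: x ** 2, 3.0)
print(round(approx, 4))  # 6.0
```

The default relative step of 1.49E-08 mirrors the step size reported in the derivative-testing tables of this section.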

Table 10 shows the results of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of Scudder high income, (LNMPSHI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (12).

LogL: LNMPSHI
Method: Maximum Likelihood (Marquardt)
Date: 11/02/13  Time: 18:52
Sample: 1 143
Included observations: 143
Evaluation order: By equation
Initial Values: C(1)=-0.50667, C(2)=0.49333, C(3)=-0.46789, C(4)=0.39707, C(5)=0.03172
Convergence achieved after 2 iterations

Coefficient Std. Error z-Statistic Prob.

C(1)   -0.506669   0.035431   -14.300160   0.0000
C(2)    0.493331   0.035024    14.085512   0.0000
C(3)   -0.467890   0.013535   -34.568895   0.0000
C(4)    0.397072   0.010052    39.501790   0.0000
C(5)    0.031715   0.090754     0.349461   0.0885

Log likelihood        -123.1553    Akaike info criterion   1.652521
Avg. log likelihood    -0.861226   Schwarz criterion       1.548925
Number of Coefs.        5          Hannan-Quinn criter.    1.610425

Source: Author’s calculation based on EViews 6 software.
Significant p-value at the 5% significance level.

By applying equation (12) to Table 10 we have the following results.

Logl = -0.51 + 0.49 - 0.47 + 0.40 + 0.03
       (-14.30) (14.09) (-34.57) (39.50) (0.35)

According to Table 10, the coefficient vectors C(1), C(2), C(3) and C(4) of Scudder high income, (LNMPSHI) closed-end fund are statistically significant at the 5% significance level; the displayed p-values are less than 5%. The C(5) coefficient is not statistically significant, as its p-value is 0.09. The log likelihood estimation is -123.16. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 1.65, 1.55 and 1.61 respectively.
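The information criteria quoted throughout the section can be recomputed from the log likelihood. Below is a minimal sketch of the standard per-observation definitions; the inputs are illustrative, not taken from Table 10, and values reported by software can differ slightly depending on the scaling convention used:

```python
import math

# Per-observation information criteria from a log likelihood value,
# following the common per-observation scaling:
#   AIC = (-2*logl + 2*k) / n
#   SIC = (-2*logl + k*ln(n)) / n
#   HQ  = (-2*logl + 2*k*ln(ln(n))) / n
# The inputs below are illustrative, not taken from the tables above.
def info_criteria(logl, k, n):
    aic = (-2.0 * logl + 2.0 * k) / n
    sic = (-2.0 * logl + k * math.log(n)) / n
    hq = (-2.0 * logl + 2.0 * k * math.log(math.log(n))) / n
    return aic, sic, hq

aic, sic, hq = info_criteria(logl=-100.0, k=3, n=50)
print(round(aic, 2))  # 4.12
```

All three criteria penalize the number of estimated coefficients k, which is why model selection below prefers the specification with the smallest criterion values.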

Graph 3 shows the gradients of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of Scudder high income, (LNMPSHI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (12).

[Graph 3: Gradients of the log likelihood function at the estimated parameters, one panel for each coefficient vector C(1) to C(5), plotted against the observations. Source: Author’s calculation based on EViews 6 software.]


According to graph 3, there are outlier values and significant fluctuations at the various observations of the coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the gradients. For example, at observation 104, C(1) has a value of 18.95 and the coefficient vector C(5) has a value of 74.44. The previous value of C(5), at observation 103, was -5.93. The gradients of the log likelihood function have been computed at the estimated parameters, and unsuccessful estimation shows up as outliers.

Table 11 shows the results of the analytic derivatives of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of Scudder high income, (LNMPSHI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (12).

Log-Likelihood derivative testing
LogL: LOGL03
Two-sided accurate numeric derivatives
Evaluated at estimated parameters

Coefficient   User     Rel. Step   Min. Step   Coef. Value
C(1)          GRAD21   1.49E-08    1.00E-10    -0.506669
C(2)          GRAD22   1.49E-08    1.00E-10     0.493331
C(3)          GRAD23   1.49E-08    1.00E-10    -0.467890
C(4)          GRAD24   1.49E-08    1.00E-10     0.397072
C(5)          GRAD25   1.49E-08    1.00E-10     0.031715

Source: Author’s calculation based on EViews 6 software.

According to Table 11, because the logl specification contains a @param statement, the derivatives were calculated at the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. Two-sided numeric derivatives have been used, as reported in the table header.

Table 12 shows the results of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Templeton global income, (LNMPTGI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (13).

LogL: LNMPTGI
Method: Maximum Likelihood (Marquardt)
Date: 11/02/13  Time: 18:54
Sample: 1 143
Included observations: 143
Evaluation order: By equation
Initial Values: C(1)=-0.43436, C(2)=0.56828, C(3)=-0.43256, C(4)=0.29258, C(5)=0.00365
Convergence achieved after 6 iterations

Coefficient Std. Error z-Statistic Prob.

C(1)   -0.434360   0.314452   -1.381323   0.2456
C(2)    0.568276   0.258137    2.201451   0.3478
C(3)   -0.432558   0.288724   -1.498171   0.1204
C(4)    0.292578   0.246979    1.184627   0.0626
C(5)    0.003647   0.090119    0.404687   0.4574

Log likelihood        -286.3648    Akaike info criterion   3.935171
Avg. log likelihood    -2.002551   Schwarz criterion       3.831575
Number of Coefs.        5          Hannan-Quinn criter.    3.893075

Source: Author’s calculation based on EViews 6 software.
Significant p-value at the 5% significance level.

By applying equation (13) to Table 12 we have the following results.

Logl = -0.43 + 0.57 - 0.43 + 0.29 + 0.00
       (-1.38) (2.20) (-1.50) (1.18) (0.40)

According to Table 12, the coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the Templeton global income, (LNMPTGI) closed-end fund are not statistically significant at the 5% significance level; the displayed p-values are greater than 5%. The log likelihood estimation is -286.36. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 3.94, 3.83 and 3.89 respectively.

Graph 4 shows the gradients of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Templeton global income, (LNMPTGI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (13).

[Graph 4: Gradients of the log likelihood function at the estimated parameters, one panel for each coefficient vector C(1) to C(5), plotted against the observations. Source: Author’s calculation based on EViews 6 software.]

According to graph 4, there are outlier values and significant fluctuations at the various observations of the coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the gradients. For example, at observation 102, C(1) has a value of 67.47 and the coefficient vector C(5) has a value of 596.31. The previous value of C(5), at observation 101, was 43.22. The gradients of the log likelihood function have been computed at the estimated parameters, and unsuccessful estimation shows up as outliers.

Table 13 shows the results of the analytic derivatives of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Templeton global income, (LNMPTGI) closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (13).

Log-Likelihood derivative testing
LogL: LOGL04
Two-sided accurate numeric derivatives
Evaluated at estimated parameters

Coefficient   User     Rel. Step   Min. Step   Coef. Value
C(1)          GRAD21   1.49E-08    1.00E-10    -0.434360
C(2)          GRAD22   1.49E-08    1.00E-10     0.568276
C(3)          GRAD23   1.49E-08    1.00E-10    -0.432558
C(4)          GRAD24   1.49E-08    1.00E-10     0.292578
C(5)          GRAD25   1.49E-08    1.00E-10     0.003647

Source: Author’s calculation based on EViews 6 software.

According to Table 13, because the logl specification contains a @param statement, the derivatives were calculated at the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. Two-sided numeric derivatives have been used, as reported in the table header.


Table 14 shows the results of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Van Kampen income trust, (LNMPVKI), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (9).

LogL: LNMPVKI
Method: Maximum Likelihood (Marquardt)
Date: 11/02/13  Time: 18:57
Sample: 1 143
Included observations: 143
Evaluation order: By equation
Initial Values: C(1)=-0.47169, C(2)=0.52542, C(3)=-0.47372, C(4)=0.00327, C(5)=0.41519
Convergence achieved after 6 iterations

Coefficient Std. Error z-Statistic Prob.

C(1)   -0.471685   0.476369   -0.990167   0.6521
C(2)    0.525423   0.219432    2.394468   0.0546
C(3)   -0.473724   0.483821   -0.979130   0.4983
C(4)    0.003266   0.223169    0.014634   0.2123
C(5)    0.415188   0.090054    4.610433   0.0244

Log likelihood        -269.2215    Akaike info criterion   3.695406
Avg. log likelihood    -1.882668   Schwarz criterion       3.591810
Number of Coefs.        5          Hannan-Quinn criter.    3.653310

Source: Author’s calculation based on EViews software.
Significant p-value at the 5% significance level.

By applying equation (9) to Table 14 we have the following results.

Logl = -0.47 + 0.53 - 0.47 + 0.00 + 0.42
       (-0.99) (2.39) (-0.98) (0.01) (4.61)

According to Table 14, the coefficient vectors C(1), C(3) and C(4) of the Van Kampen income trust, (LNMPVKI), closed-end fund are not statistically significant at the 5% significance level. The coefficients C(2) and C(5) are statistically significant. The log likelihood estimation is -269.22. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 3.70, 3.59 and 3.65 respectively.


Graph 5 shows the gradients of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Van Kampen income trust, (LNMPVKI), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (9).


[Graph 5: Gradients of the log likelihood function at the estimated parameters, one panel for each coefficient vector C(1) to C(5), plotted against the observations. Source: Author’s calculation based on EViews 6 software.]

According to graph 5, there are outlier values and significant fluctuations at the various observations of the coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the gradients. For example, at observation 13, C(1) has a value of 77.80 and the coefficient vector C(5) has a value of 86.48. The previous value of C(5), at observation 12, was -30.56. The gradients of the log likelihood function have been computed at the estimated parameters, and unsuccessful estimation shows up as outliers.

Table 15 shows the results of the analytic derivatives of the log likelihood estimation of the ARMA model specified as ARMA(2,2) and SMA(12) of the natural logarithmic monthly market price returns of the Van Kampen income trust, (LNMPVKI), closed-end fund. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market price returns observations are 143. The data was obtained from Thomson Financial Investment View database. The ARMA equation and log likelihood specification are based on equation (9).

Log-Likelihood derivative testing
LogL: LOGL05
Two-sided accurate numeric derivatives
Evaluated at estimated parameters

Coefficient   User     Rel. Step   Min. Step   Coef. Value
C(1)          GRAD21   1.49E-08    1.00E-10    -0.471685
C(2)          GRAD22   1.49E-08    1.00E-10     0.525423
C(3)          GRAD23   1.49E-08    1.00E-10    -0.473724
C(4)          GRAD24   1.49E-08    1.00E-10     0.003266
C(5)          GRAD25   1.49E-08    1.00E-10     0.415188

Source: Author’s calculation based on EViews 6 software.

According to Table 15, because the logl specification contains a @param statement, the derivatives were calculated at the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. Two-sided numeric derivatives have been used, as reported in the table header.

Section 3 summarizes and concludes.

In this article, we have tested an autoregressive moving average, ARMA(2,2) model of order AR(1), AR(2), MA(1), MA(2) and SMA(12) on the natural logarithmic monthly market returns of the closed-end funds of major investment banks, namely Van Kampen income trust, Aberdeen Asia Pacific income fund, Credit Suisse asset management income, Scudder high income and Templeton global income. The article aims to compare the log likelihood estimations of the different closed-end funds and select the model with the maximum log likelihood value and the minimum error value. The whole dataset is from 31/01/1990 to 31/12/2001. The total observations are 144 and the logarithmic monthly market returns observations are 143. The data was obtained from Thomson Financial Investment View database.

By applying the Dickey-Fuller test, we have found that the natural logarithmic monthly market price returns of all closed-end funds are stationary series. Specifically, the coefficient vectors C(1), C(2), C(3) and C(4) of the Aberdeen Asia Pacific income fund, (LNMPAAP) closed-end fund are statistically significant at the 5% significance level; the displayed p-values are less than 5%. The coefficient vector C(5) is 0.02 and its p-value is 0.83. The log likelihood estimation is -215.08. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 2.94, 2.83 and 2.90 respectively.

The coefficient vector C(2) of the Credit Suisse asset management income, (LNMPCS), closed-end fund is not statistically significant. The coefficients C(1), C(3), C(4) and C(5) are statistically significant at the 5% significance level; the displayed p-values are smaller than 5%. The log likelihood estimation is -148.96. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 2.15, 2.26 and 2.20 respectively.

The coefficient vectors C(1), C(2), C(3) and C(4) of the Scudder high income, (LNMPSHI) closed-end fund are statistically significant at the 5% significance level; the displayed p-values are less than 5%. The C(5) coefficient is not statistically significant, as its p-value is 0.09. The log likelihood estimation is -123.16. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 1.65, 1.55 and 1.61 respectively.

The coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the Templeton global income, (LNMPTGI) closed-end fund are not statistically significant at the 5% significance level; the displayed p-values are greater than 5%. The log likelihood estimation is -286.36. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 3.94, 3.83 and 3.89 respectively.

The coefficient vectors C(1), C(3) and C(4) of the Van Kampen income trust, (LNMPVKI), closed-end fund are not statistically significant at the 5% significance level. The coefficients C(2) and C(5) are statistically significant. The log likelihood estimation is -269.22. The Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion are 3.70, 3.59 and 3.65 respectively.

In terms of gradients at the estimated parameters, we have found outlier values and significant fluctuations at the various observations of the coefficient vectors C(1), C(2), C(3), C(4) and C(5) of the gradients. The gradients of the log likelihood function have been computed at the estimated parameters, and unsuccessful estimation shows up as outliers at the peaks of the graphs. The Credit Suisse asset management income, (LNMPCS), closed-end fund displayed fewer outliers and no significant fluctuations at the various observations of the coefficient vectors. Finally, the derivatives were calculated at the specified values. The relative and minimum step sizes are identical for all coefficient vectors and very close to zero. Two-sided numeric derivatives have been used.

We have concluded that the best fit model is the Scudder high income, (LNMPSHI), closed-end fund, as it has the maximum log likelihood value of -123.16 and the lowest values of the Akaike information criterion, the Schwarz criterion and the Hannan-Quinn criterion. Their values are 1.65, 1.55 and 1.61 respectively.
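The selection rule applied in this conclusion, choosing the fund whose model attains the maximum log likelihood, can be sketched as follows, using the values reported in the tables above:

```python
# Log likelihood values reported above for each closed-end fund.
log_likelihoods = {
    "LNMPAAP": -215.08,
    "LNMPCS": -148.96,
    "LNMPSHI": -123.16,
    "LNMPTGI": -286.36,
    "LNMPVKI": -269.22,
}

# The best-fit model maximizes the log likelihood, i.e. has the least
# negative value among the candidates.
best_fund = max(log_likelihoods, key=log_likelihoods.get)
print(best_fund)  # LNMPSHI
```

The same ranking is confirmed by the information criteria, since LNMPSHI also has the lowest criterion values among the five funds.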

References


Andrews, D.W.K. (1991), “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation”. Econometrica, 59, pp. 817-858.

Bhargava, A. (1986), “On the Theory of Testing for Unit Roots in Observed Time Series”. Review of Economic Studies, 53, pp. 369-384.

Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis: Forecasting and Control, Revised Edition, Oakland, CA: Holden-Day.

Davidson, R. and MacKinnon, J.G. (1993), Estimation and Inference in Econometrics, Oxford: Oxford University Press.

Dickey, D.A. and Fuller, W.A. (1979), “Distribution of the Estimators for Autoregressive Time Series with a Unit Root”. Journal of the American Statistical Association, 74, pp. 427-431.

Elliott, G., Rothenberg, T.J. and Stock, J.H. (1996), “Efficient Tests for an Autoregressive Unit Root”. Econometrica, 64, pp. 813-836.

EViews User’s Guide II (2007), Quantitative Micro Software, pp. 71-72, 284, 289, 645.

Fisher, R.A. (1932), Statistical Methods for Research Workers, 4th Edition, Edinburgh: Oliver & Boyd.

Greene, W.H. (1997), Econometric Analysis, 3rd Edition, Upper Saddle River, NJ: Prentice Hall.

Hamilton, J.D. (1994a), Time Series Analysis, Princeton, NJ: Princeton University Press.

Hayashi, F. (2000), Econometrics, Princeton, NJ: Princeton University Press.

Johnston, J. and DiNardo, J. (1997), Econometric Methods, 4th Edition, New York: McGraw-Hill.

Kwiatkowski, D., Phillips, P.C.B., Schmidt, P. and Shin, Y. (1992), “Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root”. Journal of Econometrics, 54, pp. 159-178.

Maddala, G.S. and Wu, S. (1999), “A Comparative Study of Unit Root Tests with Panel Data and a New Simple Test”. Oxford Bulletin of Economics and Statistics, 61, pp. 631-652.

Newey, W. and West, K. (1994), “Automatic Lag Selection in Covariance Matrix Estimation”. Review of Economic Studies, 61, pp. 631-653.

Ng, S. and Perron, P. (2001), “Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power”. Econometrica, 69(6), pp. 1519-1554.

Phillips, P.C.B. and Perron, P. (1988), “Testing for a Unit Root in Time Series Regression”. Biometrika, 75, pp. 335-346.

Rao, P. and Griliches, Z. (1969), “Small Sample Properties of Several Two-Stage Regression Methods in the Context of Auto-Correlated Errors”. Journal of the American Statistical Association, 64, pp. 253-272.

Said, S.E. and Dickey, D.A. (1984), “Testing for Unit Roots in Autoregressive Moving Average Models of Unknown Order”. Biometrika, 71, pp. 599-607.

Vuong, Q.H. (1989), “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses”. Econometrica, 57(2), pp. 307-333.

There is another method to calculate the log likelihood, based directly on the actual data, so you do not have to create a series of statements; the maximum log likelihood value is computed from the data in the sample. With the earlier method, you create a new log likelihood object, arrange the statements in the logl specification view, and EViews calculates the maximum value. Here, instead, insert your data in EViews, select the binary estimation method, click on the logit binary estimation method and press OK. You will get the following output. I have attached a relevant example to facilitate your learning experience.


Please check the values of the coefficients in relation to the standard errors, the z-statistics and the related probabilities. At the bottom of the table, check the log likelihood function in relation to the likelihood ratio, (LR), statistic and the probability of the likelihood ratio statistic. You have to use the above statistics to compare two or more models. You select the model whose probability of the likelihood ratio statistic is significant at less than the 5% significance level. The likelihood ratio statistic is calculated as follows:

Likelihood ratio statistic λ = 2 × (log likelihood − restricted log likelihood) = 3.976

The null hypothesis is that the slope coefficients are jointly equal to zero. This statistic has a χ² distribution with (k − 1) degrees of freedom.

Check the unrestricted log likelihood value as it includes the regressors and the intercept. The restricted log likelihood takes into consideration only the intercept and excludes the regressors. The z-statistic is calculated as the coefficient divided by the standard error.

Example of the maximum log likelihood in EViews 6 related to two equity hedge fund returns in relation to their lockup periods

Dependent Variable: Equity hedge fund returns, (galleon healthcare offshore)
Method: ML - Binary Logit
Date: 01/12/16  Time: 04:50
Sample: 1 1000
Included observations: 1000
Convergence achieved after 4 iterations
Covariance matrix computed using second derivatives

Variable Coefficient Std. Error z-Statistic Prob.

C      -0.698420   0.179525   -3.890375   0.0001
X1     -0.332125   0.233139   -1.424578   0.1543
X2      0.329809   0.234682    1.405342   0.1599

Mean dependent var      0.332000    S.D. dependent var     0.471167
S.E. of regression      0.470733    Akaike info criterion  1.273196
Sum squared resid       220.9247    Schwarz criterion      1.287919
Log likelihood         -633.5980    Hannan-Quinn criter.   1.278792
Restr. log likelihood  -635.5860    Avg. log likelihood   -0.633598
LR statistic            3.976024    McFadden R-squared     0.003128
Prob(LR statistic)      0.136967

Obs with Dep=0     668     Total obs     1000
Obs with Dep=1     332
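The headline statistics in the output above can be verified with a few lines. This is a minimal sketch; for 2 degrees of freedom the chi-squared tail probability has the closed form exp(-x/2), so no statistics library is needed:

```python
import math

# Reproduce the headline statistics from the logit output above.
logl = -633.5980        # unrestricted log likelihood
restr_logl = -635.5860  # restricted (intercept-only) log likelihood

# LR = 2 * (unrestricted - restricted); chi-squared with k - 1 = 2 df here.
lr = 2.0 * (logl - restr_logl)

# For 2 degrees of freedom the chi-squared tail probability is exp(-x/2).
p_value = math.exp(-lr / 2.0)

# The z-statistic of a coefficient is the estimate over its standard error.
z_intercept = -0.698420 / 0.179525

print(round(lr, 3), round(p_value, 4), round(z_intercept, 2))  # 3.976 0.137 -3.89
```

The recomputed p-value of roughly 0.137 matches the reported Prob(LR statistic) of 0.136967, confirming that the two regressors are jointly insignificant in this model.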

In the following example, the likelihood ratio statistic of 15.20 is highly significant, as the p-value is close to zero and less than the 5% significance level.

Dependent Variable: Equity hedge fund returns, (Seligman tech spectrum)
Method: ML - Binary Logit
Date: 01/12/16  Time: 04:50
Sample: 1 1000
Included observations: 1000
Convergence achieved after 3 iterations
Covariance matrix computed using second derivatives

Variable Coefficient Std. Error z-Statistic Prob.

C      -1.083560   0.183033   -5.920037   0.0000
X1      0.904669   0.234419    3.859202   0.0001
X2     -0.067858   0.233728   -0.290328   0.7716

Mean dependent var      0.344000    S.D. dependent var     0.475279
S.E. of regression      0.472078    Akaike info criterion  1.278103
Sum squared resid       222.1889    Schwarz criterion      1.292826
Log likelihood         -636.0514    Hannan-Quinn criter.   1.283699
Restr. log likelihood  -643.6531    Avg. log likelihood   -0.636051
LR statistic            15.20340    McFadden R-squared     0.011810
Prob(LR statistic)      0.000500

Obs with Dep=0     656     Total obs     1000
Obs with Dep=1     344

The Wald and Lagrange multiplier tests

Both tests are based on the maximum likelihood method in large samples. The statistics of both tests are used in the linear unrestricted regression and referred to as Wald coefficient restrictions.

The mathematical formula for the Wald test statistic is as follows:

W = n r² / (1 − r²)

The mathematical formula for the Lagrange multiplier test statistic is as follows:

LM = n r²
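The two formulas above can be sketched directly; the sample size and r-squared below are illustrative values, not taken from the text:

```python
# Wald and Lagrange multiplier statistics from a sample size n and an
# auxiliary-regression r-squared, following the formulas above:
#   W  = n * r2 / (1 - r2)
#   LM = n * r2
# The values of n and r2 below are illustrative.
def wald_statistic(n, r2):
    return n * r2 / (1.0 - r2)

def lm_statistic(n, r2):
    return n * r2

w = wald_statistic(n=100, r2=0.2)
lm = lm_statistic(n=100, r2=0.2)
print(round(w, 1), round(lm, 1))  # 25.0 20.0
```

Note that W ≥ LM for any r² in (0, 1), which is the standard ordering of the two statistics.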

Let’s take as an example a financial model relating the returns of a share price to market returns. The dependent variable is the share price returns and the independent variable is the market price returns.

Number of observations   Share price returns   Market price returns
 1    3.526787    8.73209
 2   -4.34533    -5.19815
 3    5.222709    6.21865
 4   -4.99619    -5.5393
 5   -3.04336     7.69808
 6   -2.375422   -4.99735
 7    2.651303    5.42777
 8   -0.68924    -1.5424
 9    0.205664    1.4639
10    2.4783      3.6528
11    0.237407   -0.1494
12    0.329728    0.16688
13   -0.26869    -0.1444
14    0.064769    0.097873
15   -0.5873     -0.09911
16    0.329225   -0.08344
17   -0.11849     0.122767
18    0.011541   -0.45767
19   -0.18757    -0.53046
20   -0.38752    -0.11118
21   -0.26835    -0.28947
22    0.262798   -0.17676
23    0.355054   -1.15686
24   -1.34302    -0.5771
25   -0.77964     0.578182
26   -0.04649    -0.05331
27    0.098381   -0.23054
28   -0.09585    -0.66625
29   -0.0059     -0.50071
30   -0.05415    -0.53128

Run the unrestricted regression equation in EViews based on the data provided. In the specification box, input the following linear equation:

share c market

Dependent Variable: SHARE
Method: Least Squares
Date: 02/24/16  Time: 16:40
Sample: 1 30
Included observations: 30

Variable Coefficient Std. Error t-Statistic Prob.

C        -0.284046   0.271224   -1.047275   0.3039
MARKET    0.422744   0.084654    4.993807   0.0000

R-squared             0.471080    Mean dependent var    -0.127295
Adjusted R-squared    0.452191    S.D. dependent var     1.993636
S.E. of regression    1.475573    Akaike info criterion  3.680310
Sum squared resid     60.96484    Schwarz criterion      3.773723
Log likelihood       -53.20465    Hannan-Quinn criter.   3.710194
F-statistic           24.93811    Durbin-Watson stat     1.868822
Prob(F-statistic)     0.000028

Then, press View, select Coefficient Tests and then Wald – Coefficient Restrictions. Set the second coefficient to zero to test the null hypothesis that the market coefficient is equal to zero: in the Wald test coefficient box, input C(2) = 0. The hypothesis test is as follows:

H0: C(2) = 0
H1: C(2) ≠ 0

Wald Test:
Equation: EQ01

Test Statistic Value df Probability

F-statistic    24.93811   (1, 28)   0.0000
Chi-square     24.93811    1        0.0000

Null Hypothesis Summary:

Normalized Restriction (= 0) Value Std. Err.

C(2) 0.422744 0.084654

Restrictions are linear in coefficients.

The value and the probability of the Chi-square statistic are significant, so we reject the null hypothesis that C(2) = 0. Please e-mail me if you have any problem with the Wald test statistic. The F-statistic is also significant, so we reject the null hypothesis and accept the alternative one.
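For a single restriction such as C(2) = 0, the Wald F-statistic is simply the squared t-statistic of the restricted coefficient, which can be checked against the output above:

```python
# With one restriction (C(2) = 0), the Wald F-statistic equals the squared
# t-statistic of the restricted coefficient. Values are taken from the
# EViews output above.
t_market = 4.993807   # t-statistic of MARKET
f_wald = 24.93811     # F-statistic reported by the Wald test

assert abs(t_market ** 2 - f_wald) < 1e-3
print(round(t_market ** 2, 5))  # 24.93811
```

This identity only holds for one linear restriction; with several restrictions the F-statistic combines the covariances of all restricted coefficients.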

Stochastic regressors or independent variables

In econometrics, the independent variables x1, x2, etc. are random variables and not constants. This is what is meant by saying that the regressors are stochastic. It is sometimes necessary to make assumptions concerning the joint distribution of the variables y and x.

157

Page 158: Introduction to Econometrics 0

Outliers

Outliers are extreme numerical values that adversely affect the regression parameters. They can be found by observing the residuals from the estimated regression equation. The problem is addressed by removing the extreme numerical values from the dataset and running the regression again with the remaining values. Outliers affect the coefficients, the standard errors and the r2.

Let’s consider as an example the total amounts of revenues and expenditures, expressed in thousands of Euros, for an investment bank. The dependent variable is revenues and the independent variable is expenditures.

Observations   Revenues, (000), Euros   Expenditures, (000), Euros
 1   200   180
 2   210   170
 3   220   130
 4   230   500
 5   240   600
 6   250   700
 7   350   310
 8   410   405
 9   420   410
10   425   420
11   430   420
12   440   400
13   450   410
14   460   430
15   470   450
16   480   460
17   490   490
18   500   495
19   510   500
20   520   505

RESIDUAL OUTPUT

Observation   Predicted Revenues, (000), Euros   Residuals
 1   315.2577   -115.258
 2   312.3322   -102.332
 3   300.6303    -80.6303
 4   408.8733   -178.873
 5   438.1282   -198.128
 6   467.3831   -217.383
 7   353.289      -3.28905
 8   381.0812     28.91882
 9   382.5439     37.45608
10   385.4694     39.53059
11   385.4694     44.53059
12   379.6184     60.38156
13   382.5439     67.45608
14   388.3949     71.6051
15   394.2459     75.75413
16   397.1714     82.82864
17   405.9478     84.05218
18   407.4106     92.58943
19   408.8733    101.1267
20   410.3361    109.6639


The large negative residuals for the first six observations are outliers. The regression output in Excel is as follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.352089
R Square             0.123966
Adjusted R Square    0.075298
Standard Error       110.1444
Observations         20

ANOVA
             df   SS         MS         F          Significance F
Regression    1   30901.56   30901.56   2.547156   0.1279
Residual     18   218372.2   12131.79
Total        19   249273.8

                             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                    262.5989       80.70008         3.254011   0.004406   93.05423    432.1436
Expenditures, (000), Euros   0.292549       0.183303         1.595981   0.1279     -0.09256    0.677655

The outliers adversely affect the coefficients, the standard errors and r2. The regression equation is estimated again by dropping the first six observations.

The current regression equation is:

y = 262.60 + 0.29 xt + εt    r2 = 0.12
    (80.70)  (0.18)

The t-statistic of the independent variable expenditures is not significant as it is 1.6.

The following table shows the current data and the regression equation that was re-estimated after dropping the first six observations. Please check the r2, the coefficients of the parameters α and β , the standard errors and the t-statistics.


Revenues, (000), Euros   Expenditures, (000), Euros
350   310
410   405
420   410
425   420
430   420
440   400
450   410
460   430
470   450
480   460
490   490
500   495
510   500
520   505

SUMMARY OUTPUT


Regression Statistics

Multiple R 0.967499

R Square 0.936055

Adjusted R Square 0.930726

Standard Error 12.08534

Observations 14

ANOVA

             df   SS         MS         F          Significance F
Regression   1    25656.26   25656.26   175.6612   1.59E-08
Residual     12   1752.664   146.0554
Total        13   27408.93

                             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                    86.09791       27.94032         3.081493   0.009511   25.22117    146.9746
Expenditures, (000), Euros   0.84351        0.063643         13.25373   1.59E-08   0.704843    0.982177

The current regression equation is y = 86.10 + 0.84x + ε, r2 = 0.94. The standard errors are (27.94) (0.06).

The t-statistic of the independent variable expenditures is significant as it is 13.25.

The inverse regression method exercise

You are given the following data relating the share prices of the banking industry to the FTSE index prices. The share prices are denominated in pounds.


Share prices, (y)   FTSE prices, (x)
10.45   7000
9.21    6700
8.45    6300
7.89    6200
6.45    6100
5.27    5900
4.78    5800
3.36    5750
2.87    5600
2.23    5500

Calculate the regression equation and the reverse regression equation without using Excel.

We are going to use a different equation from the one that I have mentioned in the handout to calculate the β coefficient. The following equation is the one that we have covered in the handout:

β = (nΣxy − ΣxΣy) / (nΣx2 − (Σx)2)

We are going to use the following one, which gives the same result as the one that we have covered in the handout. The equation for the α coefficient is the same.

β = Sxy / Sxx = (Σxiyi − n·x̄·ȳ) / (Σxi2 − n·x̄2)

First of all, we calculate x̄, ȳ, Sxx, Sxy, Syy, β and α.

Observations   Share prices, (y)   FTSE prices, (x)   x2          y2         xy
1       10.45   7000    49000000    109.2025   73150
2       9.21    6700    44890000    84.8241    61707
3       8.45    6300    39690000    71.4025    53235
4       7.89    6200    38440000    62.2521    48918
5       6.45    6100    37210000    41.6025    39345
6       5.27    5900    34810000    27.7729    31093
7       4.78    5800    33640000    22.8484    27724
8       3.36    5750    33062500    11.2896    19320
9       2.87    5600    31360000    8.2369     16072
10      2.23    5500    30250000    4.9729     12265
Total   60.96   60850   372352500   444.4044   382829

Solution

x̄ = Σx/n = 60850/10 = 6085

ȳ = Σy/n = 60.96/10 = 6.096

Sxx = Σxi2 − n·x̄2 = 372352500 − 10(37027225) = 2080250

Sxy = Σxiyi − n·x̄·ȳ = 382829 − 10(6085)(6.096) = 11887.4

Syy = Σyi2 − n·ȳ2 = 444.4044 − 10(37.161216) = 72.79224

β = Sxy / Sxx = 11887.4 / 2080250 = 0.0057144

α = ȳ − β·x̄ = 6.096 − 0.0057144 × 6085 = −28.676

The regression equation is as follows:

y = −28.676 + 0.0057x

You can insert the data in Excel and check whether your calculations are correct. You can also use the fx-83MS Casio calculator.
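The hand calculation can also be checked programmatically. A minimal sketch (numpy is an assumption; the data arrays are copied from the table above):

```python
import numpy as np

# Share prices (y) and FTSE prices (x) from the table
y = np.array([10.45, 9.21, 8.45, 7.89, 6.45, 5.27, 4.78, 3.36, 2.87, 2.23])
x = np.array([7000, 6700, 6300, 6200, 6100, 5900, 5800, 5750, 5600, 5500], float)
n = len(x)

Sxx = (x**2).sum() - n * x.mean()**2            # sum of squares of x about its mean
Sxy = (x*y).sum() - n * x.mean() * y.mean()     # cross-product about the means
Syy = (y**2).sum() - n * y.mean()**2            # sum of squares of y about its mean

beta = Sxy / Sxx                    # slope
alpha = y.mean() - beta * x.mean()  # intercept

print(round(beta, 7), round(alpha, 3))
```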


The inverse regression equation formula is as follows:

x = α′ + β′y   (1)

The coefficients of the regression equation are as follows:

β′ = Sxy / Syy = 11887.4 / 72.79224 = 163.3059   (2)

α′ = x̄ − β′·ȳ = 6085 − 163.3059 × 6.096 = 5089.487   (3)

We substitute the numerical values of equations (2) and (3) into (1) and we have the following results:

x = 5089.487 + 163.3059y
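Once Sxy and Syy are in hand, the inverse regression takes two lines. A minimal sketch (the summary statistics are taken from the worked solution above):

```python
# Summary statistics from the worked solution above
Sxy, Syy = 11887.4, 72.79224
x_bar, y_bar = 6085.0, 6.096

beta_inv = Sxy / Syy                  # slope of the regression of x on y
alpha_inv = x_bar - beta_inv * y_bar  # intercept of the inverse regression

print(round(beta_inv, 4), round(alpha_inv, 3))
```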

In Excel, you will get the following output.

SUMMARY OUTPUT

Regression Statistics

Multiple R          0.966021
R Square            0.933197
Adjusted R Square   0.924846
Standard Error      131.799
Observations        10

ANOVA

             df   SS         MS         F          Significance F
Regression   1    1941282    1941282    111.7543   5.6E-06
Residual     8    138967.8   17370.98
Total        9    2080250

                    Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept           5089.487       102.9814         49.42143   3.11E-11   4852.012    5326.963
Share prices, (y)   163.3059       15.4479          10.57139   5.6E-06    127.6829    198.9288

Exercise


Please consider the following two variables x and y related to sales and advertising expressed in thousands USD.

x    y
2    1
4    7
7    4
9    2
10   9
3    8
6    4
1    5
5    2

Calculate the estimated coefficients, the regression equation, the standard errors, the t-statistics, the R squared, the sum of squared residuals, the F-statistic, and the standard errors of the regression equation.

Run the regression equation in Excel or in EViews. In addition, use the descriptive function in Excel to get the mean and the related statistics.

Solution

x̄ = 5.22
ȳ = 4.67
R-squared = 0.019
Explained variation, ESS = 1.236765
Unexplained variation, SSE = 62.763235
Total variation, SST = 64.000000
y = 3.9985 + 0.1279x + ε

The t-statistics are (1.94) (0.37). The standard errors are (2.057) (0.344). F-statistic = 0.1379.

Please perform the above exercise in Excel and check the results.
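The decomposition SST = ESS + SSE reported in the solution can be reproduced as follows. A minimal sketch (numpy assumed; the x and y columns are copied from the table above):

```python
import numpy as np

x = np.array([2, 4, 7, 9, 10, 3, 6, 1, 5], float)  # x column from the table
y = np.array([1, 7, 4, 2, 9, 8, 4, 5, 2], float)   # y column from the table
n = len(x)

beta = ((x*y).sum() - n*x.mean()*y.mean()) / ((x**2).sum() - n*x.mean()**2)
alpha = y.mean() - beta * x.mean()

fitted = alpha + beta * x
sse = ((y - fitted)**2).sum()    # unexplained variation
sst = ((y - y.mean())**2).sum()  # total variation
ess = sst - sse                  # explained variation
r2 = ess / sst

print(round(alpha, 4), round(beta, 4), round(r2, 4))
```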


Exercise

Consider the following dataset, which represents incomes and expenditures of U.K. families located in the South West of England.

Incomes (y)   Expenditures (x)
5000    2000
7000    1000
8000    4000
9000    5000
10000   3000
11000   9000
12000   10000
15000   11000
17000   13000

Calculate the following items:
(a) Regress incomes (y) on expenditures (x) in Excel. The dependent variable is incomes and the independent variable (x) is expenditures.
(b) Calculate the estimated coefficients, the standard errors, the t-statistics of the regression equation, the R squared, the sum of squared residuals, the F-statistic, and the standard error of the regression equation.
(c) Calculate the residuals to find whether there are outliers. If yes, what action would you take?

Solution

(a)

SUMMARY OUTPUT

Regression Statistics

Multiple R          0.926414663
R Square            0.858244128
Adjusted R Square   0.837993289
Standard Error      1534.143897
Observations        9

ANOVA

             df   SS          MS         F          Significance F
Regression   1    99747040    99747040   42.38067   0.000331
Residual     7    16475182    2353597
Total        8    116222222


                   Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept          5227.737226    950.60192        5.499397   0.000907   2979.922    7475.552
Expenditures (x)   0.809489051    0.1243446        6.510044   0.000331   0.515461    1.103517

The regression equation is as follows:

y = 5227.74 + 0.809x. The t-statistics are (5.499) (6.51). The standard errors are (950.60) (0.124).

(b)

x̄ = 6444.4
ȳ = 10444.4
R-squared = 0.858
The standard error of the regression σ = 1534.14
Explained variation or sum of squares due to regression, ESS = 99747040
Unexplained variation or residual sum of squares, RSS = 16475182
Total variation, TSS = 116222222
y = 5227.74 + 0.809x

The t-statistics are (5.499) (6.51). The standard errors are (950.60) (0.124). F-statistic = 42.38.

(c)

By inspecting the residuals, the two largest residuals in absolute value belong to the first and the fifth observations, which we treat as outliers. The first observation is x = 2000 and y = 5000. The fifth observation is x = 3000 and y = 10000. A reasonable action is to re-estimate the regression after dropping them.
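The residual check in part (c) can be automated: fit the line, compute the residuals, and rank them by absolute size. A minimal sketch (numpy assumed; the data are copied from the table above):

```python
import numpy as np

y = np.array([5000, 7000, 8000, 9000, 10000, 11000, 12000, 15000, 17000], float)  # incomes
x = np.array([2000, 1000, 4000, 5000, 3000, 9000, 10000, 11000, 13000], float)    # expenditures

b, a = np.polyfit(x, y, 1)        # slope and intercept by least squares
residuals = y - (a + b * x)

# Observations ordered from largest to smallest absolute residual
order = np.argsort(-np.abs(residuals))
print(order[:2] + 1)  # the two most extreme observations, 1-based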


RESIDUAL OUTPUT (for the incomes and expenditures regression above)

Observation   Predicted Incomes (y)   Residuals
1   6846.715328    -1846.715328
2   6037.226277    962.773723
3   8465.693431    -465.693431
4   9275.182482    -275.182482
5   7656.204380    2343.795620
6   12513.138690   -1513.138690
7   13322.627740   -1322.627740
8   14132.116790   867.883210
9   15751.094890   1248.905110

Exercise

You are given the following data related to the number of persons and their daily incomes expressed in pounds.

Number of persons, (x)   Daily incomes, (y)
43   63.00
32   54.30
32   51.00
30   39.00
26   52.00
25   55.00
23   41.20
22   47.70
22   44.50
21   43.00
20   46.80
20   42.40
19   56.50
19   55.00
19   53.00
18   55.00
18   45.00
17   50.70
17   37.50

You are required to calculate the following:

(a) Regress y on x in Excel and show the regression equation in terms of

y = α + βx + ε. In addition, display in parentheses under the coefficients the t-statistics and the standard errors.

(b) Construct a 95% confidence interval for β .

(c) Test the null hypothesis at the 5% significance level: H0: β = 0 against H1: β ≠ 0.

(a)

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.386547

R Square 0.149419

Adjusted R Square 0.099385

Standard Error 6.465665

Observations 19

ANOVA

             df   SS         MS         F          Significance F
Regression   1    124.8433   124.8433   2.986338   0.102089
Residual     17   710.682    41.80482
Total        18   835.5253


                         Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept                39.96495       5.481549         7.290813   1.26E-06   28.39987    51.53002
Number of persons, (x)   0.39112        0.226329         1.728102   0.102089   -0.08639    0.868633

The regression equation is as follows:

y = 39.96 + 0.39x. The t-statistics are (7.29) (1.728). The standard errors are (5.48) (0.226).

(b) The 95% confidence interval for the coefficient β is as follows:

The total number of observations is n = 19, so we use the t-distribution. The degrees of freedom are (n − 2) = 19 − 2 = 17. The t-value for 17 d.f. at the 5% significance level is 2.110.

The confidence interval formula for the coefficient β is as follows:

β ± t-value × (standard error of β)

0.39112 ± 2.110 × 0.226329 = (−0.086, 0.869)

The upper confidence limit is 0.391 + 2.110 × 0.226 = 0.87 (to 2 d.p.). The lower confidence limit is 0.391 − 2.110 × 0.226 = −0.09 (to 2 d.p.). These match the Lower 95% and Upper 95% values in the Excel output.
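The interval can also be computed directly. A minimal sketch (2.110 is the two-sided 5% critical value for the 17 residual degrees of freedom shown in the ANOVA table; the estimates are from the Excel output):

```python
beta_hat = 0.39112  # estimated slope from the Excel output
se_beta = 0.226329  # its standard error
t_crit = 2.110      # t critical value, 17 d.f., 5% significance (two-sided)

lower = beta_hat - t_crit * se_beta
upper = beta_hat + t_crit * se_beta
print(round(lower, 3), round(upper, 3))
```

Because the interval straddles zero, the slope is not significantly different from zero at the 5% level.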

(c) Test the null hypothesis at the 5% significance level: H0: β = 0 against H1: β ≠ 0.

The 95% confidence interval for β contains zero, and the t-statistic (1.728) is smaller than the critical value of 2.110, so we do not reject the null hypothesis that β = 0 at the 5% significance level.


Exercise

You are given the following data relating the share prices of the car industry to the German DAX index prices. The share prices are expressed in euros.

Share prices, (y)   DAX prices, (x)
10.45   7000
9.21    6700
8.45    6300
7.89    6200
6.45    6100
5.27    5900
4.78    5800
3.36    5750
2.87    5600
2.23    5500

a) Calculate the regression equation of the share prices expressed as the dependent variable y in relation to the independent variable DAX prices, x, in Excel. Plot the regression equation.

b) What is the expected sign of the coefficient of the independent variable x?

c) Was the sign of the coefficients the expected one?

a)

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.966021
R Square            0.933197
Adjusted R Square   0.924846
Standard Error      0.779645
Observations        10

ANOVA

             df   SS         MS         F          Significance F
Regression   1    67.92947   67.92947   111.7543   5.6E-06
Residual     8    4.862771   0.607846
Total        9    72.79224

                  Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept         -28.6762       3.2985           -8.6937   0.0000    -36.2825    -21.0698
DAX prices, (x)   0.0057         0.0005           10.5714   0.0000    0.0045      0.0070

The regression equation is as follows:

y = −28.6762 + 0.0057x. The t-statistics are (−8.69) (10.57). The standard errors are (3.2985) (0.0005).

b) What is the expected sign of the coefficient of the independent variable x?The expected sign of the coefficient β of the independent variable DAX prices is positive.

c) Was the sign of the coefficient the expected one?

Yes, it is, because as the DAX index increases the share prices increase.

Exercise

Consider data on two variables x and y gathered by two statisticians. The data are related to the number of futures contracts sold.

Statistician 1
y   x
8   5
7   4
6   3
5   2
4   9
3   8
2   7
5   6
1   5
7   2

Statistician 2
y   x
9   9
8   8
6   7
4   6
3   5
2   4
5   3
2   2
6   8
8   1

a) Calculate the regression equations from the data gathered from the two statisticians.

b) Are there any differences in the standard errors of β for the two datasets?

a) The output for statistician 1

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.51

R Square 0.26
Adjusted R Square 0.17

Standard Error 2.09

Observations 10

ANOVA

             df   SS         MS         F          Significance F
Regression   1    12.58299   12.58299   2.874714   0.128425
Residual     8    35.01701   4.377127
Total        9    47.6


            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   7.287335       1.609307         4.528244   0.001929   3.576264    10.99841
x           -0.48771       0.287652         -1.6955    0.128425   -1.15104    0.175613

The regression equation is as follows:

y = 7.287 − 0.4877x. The standard errors are (1.61) (0.29).

The output for statistician 2

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.45

R Square 0.20
Adjusted R Square 0.10

Standard Error 2.41

Observations 10.00

ANOVA

             df   SS         MS         F          Significance F
Regression   1    11.59486   11.59486   1.994594   0.195558
Residual     8    46.50514   5.813142
Total        9    58.1

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   3.113069       1.726015         1.803616   0.108949   -0.86713    7.09327
x           0.412628       0.292168         1.412301   0.195558   -0.26111    1.086368

The regression equation is as follows:

y = 3.11 + 0.41x. The standard errors are (1.73) (0.29).


b)

No, to two decimal places there is no difference in the standard errors of β for the two datasets. In the first regression equation SE(β) = 0.29 and in the second regression equation SE(β) = 0.29.
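The claim can be verified by computing SE(β̂) = √(MSE / Sxx) for each dataset. A minimal sketch (numpy assumed; the arrays are copied from the two tables):

```python
import numpy as np

def slope_se(x, y):
    """Standard error of the OLS slope: sqrt(MSE / Sxx)."""
    n = len(x)
    b, a = np.polyfit(x, y, 1)
    mse = ((y - (a + b * x))**2).sum() / (n - 2)  # residual mean square
    sxx = ((x - x.mean())**2).sum()
    return np.sqrt(mse / sxx)

# Statistician 1 and statistician 2 datasets
y1 = np.array([8, 7, 6, 5, 4, 3, 2, 5, 1, 7], float)
x1 = np.array([5, 4, 3, 2, 9, 8, 7, 6, 5, 2], float)
y2 = np.array([9, 8, 6, 4, 3, 2, 5, 2, 6, 8], float)
x2 = np.array([9, 8, 7, 6, 5, 4, 3, 2, 8, 1], float)

se1, se2 = slope_se(x1, y1), slope_se(x2, y2)
print(round(se1, 2), round(se2, 2))
```

Both round to 0.29, although at more decimal places they differ slightly (0.2877 versus 0.2922, as in the Excel outputs).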

Exercise

You are given the following dataset related to the output and the total cost of a company. The output is denoted in units and total cost in pounds.

Output in units, (y)   Total cost in pounds, (x)
0   12
1   23
2   29
3   34
4   42
5   57
6   71
7   74

The output in units is the dependent variable and the total cost in pounds is the independent variable.

You are given the following linear equation:

y=α+βx+ε

(a) Calculate the coefficients of this equation.
(b) Calculate the 90% confidence interval for the coefficient β.

(a)

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.987786
R Square            0.975722
Adjusted R Square   0.971675
Standard Error      0.412249
Observations        8

ANOVA

             df   SS         MS         F          Significance F
Regression   1    40.9803    40.9803    241.1326   4.51E-06
Residual     6    1.019695   0.169949
Total        7    42

                       Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept              -1.07417       0.328654         -3.2684    0.017068   -1.87836    -0.26998
Total cost in pounds   0.106998       0.00689          15.52844   4.51E-06   0.090138    0.123859

The regression equation is as follows:

y = −1.07 + 0.11x. The standard errors are (0.33) (0.01).

(b)

The total number of observations is n = 8, so we use the t-distribution. The degrees of freedom are (n − 2) = 8 − 2 = 6. The t-value for 6 d.f. at the 10% significance level is 1.943.

The confidence interval formula for the coefficient β is as follows:

β ± t-value × (standard error of β)

0.106998 ± 1.943 × 0.00689

The upper confidence limit is 0.106998 + 1.943 × 0.00689 = 0.12 (to 2 d.p.). The lower confidence limit is 0.106998 − 1.943 × 0.00689 = 0.09 (to 2 d.p.).

Please check your results against the Excel output concerning the 95% lower and upper confidence levels.

                       Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Total cost in pounds   0.106998       0.00689          15.52844   4.51E-06   0.090138    0.123859


Exercise

You are given the following data:

x̄ = 8, x = 2, ȳ = 0.6, β = 0.2, Sxx = 50, Sxy = 16, Syy = 60, n = 20

The regression equation is y = 4.0 + 0.2x.

Calculate the predicted value of y for x = 2.

y = 4.0 + 0.2x = 4.0 + 0.2(2) = 4.4

Calculate the standard error of the prediction.

SE(ŷ) = √( σ2 (1 + 1/n + (x − x̄)2 / Sxx) )   (1)

The residual variance is calculated as follows:

σ2 = (Syy − β·Sxy) / (n − 2) = (60 − 0.2 × 16) / 18 = 56.8 / 18 = 3.156   (2)

By substituting equation (2) into equation (1), the standard error is as follows:

SE(ŷ) = √( 3.156 × (1 + 1/20 + (2 − 8)2/50) ) = √( 3.156 × (1 + 0.05 + 0.72) ) = 2.363 (to 3 d.p.)


Exercise

You are given data related to Danish and Swedish equity funds traded by a major Danish investment bank. You are required to construct a scatterplot of the Danish returns versus the Swedish returns. Sketch a regression line. Compare the standard errors and t-statistics of both funds and decide whether they are significant at the 95% confidence level.

Date   Danish equity fund   Returns   Date   Swedish equity fund   Returns

21/4/2009 180 21/4/2009 12021/5/2009 201.17 11.76 21/5/2009 167.57 39.64

25/5/09 201.03 11.68 25/5/09 164.95 37.4626/5/09 196.43 9.13 26/5/09 162.05 35.0427/5/09 199.2 10.67 27/5/09 165.52 37.9328/5/09 194.26 7.92 28/5/09 164 36.6729/5/09 191.44 6.36 29/5/09 167.31 39.43

2/6/2009 197.86 9.92 2/6/2009 169.3 41.083/6/2009 194.79 8.22 3/6/2009 168.56 40.474/6/2009 195.53 8.63 4/6/2009 165.68 38.078/6/2009 192.44 6.91 8/6/2009 162.75 35.639/6/2009 192.85 7.14 9/6/2009 163.49 36.24

10/6/2009 198.09 10.05 10/6/2009 168.87 40.7311/6/2009 199.54 10.86 11/6/2009 169.74 41.4512/6/2009 201.03 11.68 12/6/2009 171.05 42.5415/6/2009 199.12 10.62 15/6/2009 169.29 41.0816/6/2009 195.75 8.75 16/6/2009 166.55 38.7917/6/2009 192.78 7.10 17/6/2009 165.3 37.7518/6/2009 189.25 5.14 18/6/2009 164.23 36.8619/6/2009 192.34 6.86 19/6/2009 165.45 37.8820/6/2009 192.34 6.86 20/6/2009 165.45 37.8821/6/2009 192.34 6.86 21/6/2009 165.45 37.8822/6/2009 188.51 4.73 22/6/2009 162.96 35.8023/6/2009 184.16 2.31 23/6/2009 160.81 34.0124/6/2009 186.69 3.72 24/6/2009 162.52 35.4325/6/2009 187.24 4.02 25/6/2009 163.84 36.53

7/6/2009 183.26 1.81 7/6/2009 161.22 34.357/7/2009 185.47 3.04 7/7/2009 164.1 36.757/8/2009 181.86 1.03 7/8/2009 162.12 35.107/9/2009 182.8 1.56 7/9/2009 161.88 34.90

7/10/2009 181.94 1.08 7/10/2009 160.03 33.3613/7/2009 180.57 0.32 13/7/2009 158.96 32.4714/7/2009 182.75 1.53 14/7/2009 163.2 36.0015/7/2009 185.07 2.82 15/7/2009 162.16 35.1316/7/2009 187.41 4.12 16/7/2009 168.31 40.2617/7/2009 188.41 4.67 17/7/2009 169.31 41.0918/7/2009 189.41 5.23 18/7/2009 170.31 41.9319/7/2009 190.41 5.78 19/7/2009 171.31 42.7620/7/2009 191.41 6.34 20/7/2009 172.31 43.59


7/21/2009 193.42 7.46 7/21/2009 173.75 44.797/22/2009 192.51 6.95 7/22/2009 174.05 45.047/23/2009 193.51 7.51 7/23/2009 175.05 45.887/24/2009 197.9 9.94 7/24/2009 178.37 48.647/25/2009 198.9 10.50 7/25/2009 179.37 49.487/26/2009 199.9 11.06 7/26/2009 180.37 50.317/27/2009 198.67 10.37 7/27/2009 177.39 47.837/28/2009 199.6 10.89 7/28/2009 175.2 46.007/29/2009 197.31 9.62 7/29/2009 175.83 46.537/30/2009 199.26 10.70 7/30/2009 177.38 47.827/31/2009 200.96 11.64 7/31/2009 178.55 48.79

8/1/2009 201.96 12.20 8/1/2009 179.55 49.638/2/2009 202.96 12.76 8/2/2009 180.55 50.468/3/2009 203.96 13.31 8/3/2009 181.55 51.298/4/2009 204.67 13.71 8/4/2009 180.16 50.138/5/2009 205.67 14.26 8/5/2009 181.16 50.978/6/2009 206.67 14.82 8/6/2009 182.16 51.808/7/2009 207.67 15.37 8/7/2009 183.16 52.638/8/2009 208.67 15.93 8/8/2009 184.16 53.478/9/2009 209.67 16.48 8/9/2009 185.16 54.30

8/10/2009 210.67 17.04 8/10/2009 186.16 55.138/11/2009 211.67 17.59 8/11/2009 187.16 55.978/12/2009 212.67 18.15 8/12/2009 188.16 56.808/13/2009 212.4 18.00 8/13/2009 182.02 51.688/14/2009 213.4 18.56 8/14/2009 183.02 52.528/15/2009 214.4 19.11 8/15/2009 184.02 53.358/16/2009 215.4 19.67 8/16/2009 185.02 54.188/17/2009 216.4 20.22 8/17/2009 186.02 55.028/18/2009 217.4 20.78 8/18/2009 187.02 55.858/19/2009 205.7 14.28 8/19/2009 176.74 47.288/20/2009 208.24 15.69 8/20/2009 181.41 51.188/21/2009 212.17 17.87 8/21/2009 182.62 52.188/24/2009 213.17 18.43 8/24/2009 183.62 53.028/25/2009 215.82 19.90 8/25/2009 186.51 55.438/26/2009 217.35 20.75 8/26/2009 186.16 55.138/27/2009 218.35 21.31 8/27/2009 187.16 55.978/28/2009 219.24 21.80 8/28/2009 186.47 55.398/29/2009 220.24 22.36 8/29/2009 187.47 56.238/30/2009 221.24 22.91 8/30/2009 188.47 57.068/31/2009 215.62 19.79 8/31/2009 185.74 54.78

9/1/2009 214.58 19.21 9/1/2009 182.4 52.009/2/2009 215.58 19.77 9/2/2009 183.4 52.839/3/2009 210.4 16.89 9/3/2009 180.51 50.439/4/2009 212.63 18.13 9/4/2009 181.6 51.339/5/2009 213.63 18.68 9/5/2009 182.6 52.179/6/2009 214.63 19.24 9/6/2009 183.6 53.009/7/2009 215.68 19.82 9/7/2009 184.79 53.999/8/2009 216.55 20.31 9/8/2009 185.59 54.669/9/2009 217.57 20.87 9/9/2009 186.79 55.66

9/10/2009 219.4 21.89 9/10/2009 187.68 56.409/11/2009 219.77 22.09 9/11/2009 188.99 57.499/12/2009 220.77 22.65 9/12/2009 189.99 58.339/13/2009 221.77 23.21 9/13/2009 190.99 59.169/14/2009 215.94 19.97 9/14/2009 186.66 55.55


9/15/2009 216.47 20.26 9/15/2009 188.22 56.859/16/2009 217.47 20.82 9/16/2009 189.22 57.689/17/2009 222.79 23.77 9/17/2009 190.79 58.99

Exercise

You are given interest rates from different countries that mature in three months. The markets included are the United States market rates denoted as US, the Japanese market rates denoted as JP, the United Kingdom market rates denoted as GB, the euro market rates denoted as EU, the Norwegian market rates denoted as NO and the Danish market rates denoted as DK.

Euro market rates, maturity 3 months
US 3M   JP 3M   GB 3M   EU 3M   NO 3M   DK 3M

Date Value Value Value Value Value Value1/2/2009 1.71 0.96 2.29 2.83 3.2 4.21/5/2009 1.13 1.29 2.07 3.04 3.51 4.351/7/2009 1.5 1.13 2 2.94 3.55 4.11/8/2009 1.15 0.62 1.74 2.7 3.62 4.11/9/2009 1.22 0.74 1.96 2.53 3.55 4

1/12/2009 1.21 0.82 2.14 2.54 3.41 41/13/2009 1.23 0.72 2.1 2.31 3.35 3.91/14/2009 1.15 0.71 2.09 2.55 3.24 3.81/15/2009 1.2 0.6 1.89 2.31 3.35 3.81/16/2009 1.17 0.54 2.08 2.29 3.31 3.61/19/2009 1.24 0.66 1.86 2.31 3.45 3.51/20/2009 1.02 0.53 1.88 2.35 3.41 3.61/21/2009 1.35 0.82 1.94 1.97 3.09 3.451/22/2009 1.22 0.29 1.65 2.09 3.33 3.61/23/2009 1.29 0.79 1.74 2.14 3.22 3.451/26/2009 1.12 0.81 1.91 1.87 3.42 3.451/27/2009 1.31 0.72 1.94 2.01 3.45 3.451/28/2009 1.2 0.68 1.99 1.97 3.48 3.61/29/2009 1.07 0.79 1.92 2.1 3.3 3.61/30/2009 1.31 0.73 1.95 2.08 3.3 3.6

2/2/2009 1.34 0.53 2 1.93 2.98 3.452/3/2009 1.18 0.34 1.75 1.79 3.19 3.42/4/2009 1.19 0.78 2.02 1.74 3.46 3.52/5/2009 1.2 0.57 1.68 1.63 3.41 3.42/6/2009 1.69 0.61 1.99 1.93 3.4 3.452/9/2009 1.39 0.72 1.97 1.8 3.34 3.4

2/10/2009 1.59 0.69 1.93 1.95 3.39 3.352/11/2009 1.63 0.85 1.82 1.88 3.13 2.752/12/2009 1.61 0.83 1.79 1.81 3 3.32/13/2009 1.39 0.87 1.8 1.92 3.14 3.32/16/2009 1.2 0.86 1.78 1.9 3.15 3.252/17/2009 1.23 0.84 1.9 1.9 3.24 3.32/18/2009 1.39 0.69 1.4 1.87 3.16 3.252/19/2009 1.52 0.88 1.74 1.73 3.22 3.25


2/20/2009 1.66 0.85 1.86 1.86 3.22 32/23/2009 1.39 0.7 1.87 1.72 3.21 4.362/24/2009 1.63 0.87 1.8 1.83 3.04 3.22/25/2009 1.45 0.9 1.68 1.68 2.94 3.22/26/2009 1.42 0.91 1.68 1.68 2.93 3.252/27/2009 1.46 0.92 1.79 1.8 2.77 3.15

3/2/2009 1.45 0.9 1.71 1.78 2.69 3.053/3/2009 1.45 0.98 1.73 1.78 2.8 2.63/4/2009 1.44 0.76 1.69 1.56 2.85 2.93/5/2009 1.3 0.94 1.75 1.54 2.88 2.83/6/2009 1.44 0.9 1.71 1.7 2.81 2.553/9/2009 1.46 0.93 1.68 1.58 2.7 2.55

3/10/2009 1.27 0.52 1.62 1.68 2.86 2.553/11/2009 1.3 0.52 1.62 1.54 2.96 2.553/12/2009 1.6 0.52 1.64 1.52 2.93 2.553/13/2009 1.43 0.92 1.52 1.5 2.97 2.553/16/2009 1.7 0.94 1.52 1.49 2.87 2.553/17/2009 1.29 0.47 1.58 1.59 3 2.33/18/2009 1.64 1.01 1.45 1.39 3.02 2.33/19/2009 1.32 0.47 1.33 1.43 3.16 2.453/20/2009 1.25 0.48 1.32 1.46 3.17 2.453/23/2009 1.12 0.48 1.48 1.43 3.02 2.653/24/2009 1.28 0.48 1.49 1.53 2.94 2.453/25/2009 1.5 0.48 1.49 1.53 2.64 2.43/26/2009 1.55 0.94 1.49 1.25 2.65 2.33/27/2009 1.58 0.95 1.48 1.49 2.65 2.23/30/2009 1.45 0.65 1.45 1.48 2.58 2.13/31/2009 1.13 0.48 1.46 1.38 2.55 2.1

4/1/2009 1.45 0.48 1.44 1.46 2.6 2.054/2/2009 1.45 0.6 1.46 1.39 2.66 2.44/3/2009 1.44 0.59 1.53 1.42 2.66 1.54/6/2009 1.25 0.46 1.23 1.42 2.67 2.054/7/2009 1.45 0.57 1.38 1.42 2.82 2.054/8/2009 1.34 0.71 1.31 1.37 2.88 2.054/9/2009 1.07 0.49 1.18 1.29 2.84 1.9

4/14/2009 1.15 0.58 1.21 1.43 2.57 24/15/2009 1.22 0.59 1.23 1.37 2.61 2.054/16/2009 1.18 0.58 1.21 1.38 2.56 2.054/17/2009 1.25 0.38 1.14 1.39 2.5 2.054/20/2009 0.97 0.48 1.1 1.34 2.43 24/21/2009 0.97 0.52 1.15 1.34 2.54 2.74/22/2009 1.12 0.51 1.14 1.1 2.56 1.94/23/2009 0.95 0.38 1.24 1.34 2.43 2.24/24/2009 1.17 0.46 1.17 1.31 2.58 1.94/27/2009 0.91 0.46 1.13 1.28 2.41 2.14/28/2009 0.93 0.71 1.22 1.19 2.37 1.854/29/2009 1.2 0.66 1.21 1.3 2.43 1.854/30/2009 1.2 0.38 1.09 1.3 2.4 1.7

5/4/2009 1.19 0.61 1.19 1.11 2.38 1.855/5/2009 1.22 0.69 1.22 1.3 2.24 1.85/6/2009 1.08 0.59 1.07 1.23 2.18 1.75


5/7/2009 1.03 0.55 1.06 1.19 2.27 2.055/8/2009 1.11 0.64 1.18 1.21 2.2 2.25

5/11/2009 1.06 0.62 1.13 1.22 2.19 1.75/12/2009 1.04 0.61 1.13 1.22 1.97 1.75/13/2009 0.91 0.39 1.04 1.2 2.07 1.75/14/2009 0.95 0.4 1.03 1.18 2.13 1.455/15/2009 0.93 0.52 1.03 1.06 2.1 1.75/18/2009 1 0.59 1.16 1.22 1.93 2.15/19/2009 0.92 0.55 1.08 1.16 1.93 1.45/20/2009 0.84 0.5 1.07 1.16 2.04 1.775/22/2009 0.78 0.41 0.92 1.15 1.9 1.255/25/2009 0.56 0.39 0.92 1.2 1.89 2.355/26/2009 0.93 0.53 1.08 1.18 1.87 1.655/27/2009 0.6 0.3 0.95 1.26 1.83 1.75/28/2009 0.82 0.31 0.96 1.24 1.94 1.435/29/2009 0.56 0.44 0.91 1.15 1.99 1.65

6/1/2009 0.96 0.49 1.1 1.06 2.01 1.436/2/2009 0.96 0.53 0.99 1.12 2.02 1.656/3/2009 0.74 0.3 0.98 1.22 1.99 1.856/4/2009 0.8 0.31 0.92 1.18 1.92 1.556/5/2009 0.88 0.46 1 1.16 1.97 1.16/8/2009 0.79 0.31 1.01 1.15 1.96 1.56/9/2009 0.85 0.5 0.99 1.15 1.91 1.4

6/10/2009 0.85 0.32 0.76 1.17 1.79 1.56/11/2009 0.74 0.32 0.82 0.95 1.81 1.56/12/2009 0.82 0.37 0.76 1.22 1.76 1.55

You are required to construct a scatterplot of the different interest rates. Sketch a straight regression line to fit the data. Is there a relationship between the interest rates of the different countries?
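Before sketching a regression line, the strength of any pairwise relationship can be gauged with a correlation coefficient. A minimal sketch (numpy assumed; the arrays are just the first five US 3M and EU 3M values from the table, purely for illustration):

```python
import numpy as np

# First five observations of the US 3M and EU 3M series (illustrative subset)
us_3m = np.array([1.71, 1.13, 1.50, 1.15, 1.22])
eu_3m = np.array([2.83, 3.04, 2.94, 2.70, 2.53])

r = np.corrcoef(us_3m, eu_3m)[0, 1]             # Pearson correlation
slope, intercept = np.polyfit(us_3m, eu_3m, 1)  # coefficients of a fitted straight line
print(round(r, 3))
```

The same two lines, applied to each pair of full series, quantify what the scatterplots show visually.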

Exercise

You are given prices of international government bonds from different countries that mature in 10 years. The markets included are the United States market denoted as US, the Japanese market denoted as JP, the German market denoted as DE, the Netherlands market denoted as NL, the French market denoted as FR, the United Kingdom market denoted as GB, the euro market denoted as EU and the Norwegian market denoted as NO.

International Government Bonds, maturity 10 years
US 10Y   JP 10Y   DE 10Y   NL 10Y   FR 10Y   GB 10Y   EU 10Y   NO 10Y

Date Value Value Value Value Value Value Value Value1/2/2009 2.3555 n/a 2.954 3.501 3.384 3.072 2.954 3.8911/5/2009 2.4829 1.223 3.016 3.585 3.474 3.17 3.016 3.961/7/2009 2.4958 1.266 3.175 3.737 3.69 3.301 3.175 4.0121/8/2009 2.4448 1.319 3.142 3.701 3.639 3.249 3.142 3.919


1/9/2009 2.3951 1.3 3.032 3.594 3.531 3.165 3.032 3.8321/12/2009 2.3111 1.309 2.996 3.605 3.46 3.156 2.996 3.8431/13/2009 2.2958 1.257 2.988 3.646 3.469 3.23 2.988 3.8321/14/2009 2.2076 1.276 2.937 3.581 3.478 3.148 2.937 3.7061/15/2009 2.2105 1.223 2.881 3.57 3.391 3.176 2.881 3.5761/16/2009 2.3388 1.233 2.944 3.642 3.515 3.259 2.944 3.5931/19/2009 2.3405 1.261 2.999 3.748 3.516 3.463 2.999 3.6451/20/2009 2.3821 1.233 3.047 3.811 3.6 3.476 3.047 3.6091/21/2009 2.5445 1.242 2.994 3.824 3.566 3.489 2.994 3.5751/22/2009 2.5956 1.247 3.07 3.906 3.631 3.531 3.07 3.5241/23/2009 2.6224 1.247 3.194 3.999 3.759 3.723 3.194 3.6531/26/2009 2.648 1.233 3.335 4.144 3.902 3.749 3.335 3.8191/27/2009 2.5291 1.276 3.271 4.034 3.816 3.678 3.271 3.8071/28/2009 2.6734 1.271 3.242 3.961 3.775 3.674 3.242 3.7611/29/2009 2.8709 1.276 3.225 3.909 3.759 3.72 3.225 3.7431/30/2009 2.8509 1.304 3.286 3.994 3.851 3.735 3.286 3.7372/2/2009 2.7191 1.304 3.278 3.956 3.797 3.73 3.278 3.7262/3/2009 2.8929 1.309 3.324 4 3.826 3.8 3.324 3.7262/4/2009 2.9389 1.357 3.363 3.925 3.84 3.815 3.363 3.7942/5/2009 2.9173 1.342 3.343 3.942 3.768 3.773 3.343 3.7542/6/2009 2.9953 1.342 3.377 3.955 3.764 3.784 3.377 3.7972/9/2009 2.9916 1.319 3.429 3.997 3.814 3.807 3.429 3.986

2/10/2009 2.8172 1.319 3.371 3.946 3.835 3.668 3.371 3.9442/11/2009 2.7572 1.304 3.202 3.943 3.699 3.416 3.202 3.8922/12/2009 2.7896 1.271 3.105 3.875 3.576 3.502 3.105 3.8052/13/2009 2.8986 1.271 3.116 3.89 3.655 3.598 3.116 3.832/16/2009 2.8949 1.304 3.054 3.862 3.621 3.537 3.054 3.7382/17/2009 2.6515 1.295 2.969 3.829 3.557 3.448 2.969 3.7182/18/2009 2.7644 1.266 2.978 3.825 3.553 3.462 2.978 3.7352/19/2009 2.8586 1.276 3.076 3.915 3.594 3.532 3.076 3.8622/20/2009 2.7915 1.28 3.021 3.844 3.533 3.447 3.021 3.8212/23/2009 2.759 1.285 3.017 3.822 3.521 3.47 3.017 3.8672/24/2009 2.8042 1.28 3 3.825 3.54 3.43 3 3.8262/25/2009 2.9263 1.314 2.986 3.792 3.565 3.475 2.986 3.8142/26/2009 3 1.29 3.136 3.931 3.71 3.671 3.136 3.8722/27/2009 3.0187 1.285 3.086 3.891 3.655 3.668 3.086 3.8663/2/2009 2.8697 1.309 3.039 3.841 3.644 3.576 3.039 3.8313/3/2009 2.8862 1.299 3.059 3.859 3.628 3.577 3.058 3.8313/4/2009 2.98 1.309 3.122 3.894 3.755 3.689 3.122 3.8723/5/2009 2.8151 1.323 3.033 3.801 3.655 3.376 3.033 3.8023/6/2009 2.8735 1.304 2.917 3.74 3.547 3.106 2.917 3.7553/9/2009 2.8608 1.309 2.939 3.769 3.606 3.137 2.939 3.767

3/10/2009 3.0118 1.314 3.011 3.836 3.674 3.124 3.011 3.8133/11/2009 2.9084 1.323 3.066 3.83 3.71 3.1 3.066 3.8773/12/2009 2.8608 1.342 3 3.756 3.631 2.981 3 3.8443/13/2009 2.8957 1.328 3.07 3.768 3.632 2.974 3.07 3.9123/16/2009 2.9566 1.309 3.156 3.831 3.705 3.046 3.156 3.9353/17/2009 3.0122 1.314 3.204 3.87 3.719 3.08 3.204 3.9763/18/2009 2.5276 1.314 3.221 3.894 3.752 3.128 3.221 4.0353/19/2009 2.6148 1.276 3.043 3.743 3.627 3.078 3.043 3.9293/20/2009 2.6362 1.276 2.984 3.664 3.551 3.062 2.984 3.876


3/23/2009 2.6523 1.276 3.033 3.721 3.602 3.153 3.033 3.9333/24/2009 2.7082 1.276 3.122 3.787 3.661 3.352 3.122 4.0413/25/2009 2.7935 1.299 3.147 3.792 3.654 3.33 3.131 3.9993/26/2009 2.7444 1.328 3.129 3.78 3.644 3.338 3.129 3.913/27/2009 2.7625 1.337 3.078 3.738 3.628 3.309 3.078 3.8633/30/2009 2.7171 1.342 3.032 3.763 3.636 3.199 3.032 3.8163/31/2009 2.6665 1.347 3.016 3.736 3.625 3.182 3.016 3.814/1/2009 2.6611 1.352 3.002 3.731 3.615 3.144 3.002 3.8044/2/2009 2.7807 1.38 3.118 3.831 3.692 3.411 3.118 3.9084/3/2009 2.889 1.423 3.197 3.825 3.744 3.464 3.197 3.9634/6/2009 2.9279 1.461 3.209 3.81 3.732 3.464 3.209 4.0164/7/2009 2.9002 1.437 3.226 3.861 3.739 3.465 3.226 4.014/8/2009 2.8596 1.46 3.198 3.853 3.688 3.379 3.198 3.9924/9/2009 2.9225 1.479 3.267 3.894 3.778 3.313 3.267 3.992

4/14/2009 2.79 1.465 3.216 3.827 3.711 3.221 3.216 3.984/15/2009 2.7717 1.437 3.155 3.756 3.637 3.281 3.155 3.9144/16/2009 2.8451 1.456 3.177 3.766 3.66 3.27 3.177 3.9554/17/2009 2.9507 1.451 3.264 3.851 3.759 3.383 3.264 4.0214/20/2009 2.8452 1.465 3.161 3.717 3.64 3.239 3.161 3.9974/21/2009 2.9006 1.461 3.124 3.707 3.616 3.349 3.124 3.924/22/2009 2.9397 1.437 3.216 3.771 3.663 3.458 3.216 3.9554/23/2009 2.9211 1.432 3.235 3.754 3.657 3.556 3.235 3.9494/24/2009 2.9997 1.413 3.199 3.738 3.628 3.525 3.199 3.9374/27/2009 2.9101 1.442 3.158 3.701 3.6 3.503 3.158 3.9074/28/2009 3.0204 1.418 3.159 3.676 3.593 3.486 3.159 3.9074/29/2009 3.1092 1.413 3.127 3.648 3.575 3.485 3.127 3.9424/30/2009 3.1131 1.427 3.216 n/a 3.612 3.514 3.216 3.9725/4/2009 3.1514 n/a 3.253 3.685 3.639 3.58 3.253 3.9965/5/2009 3.1687 n/a 3.204 3.659 3.589 3.592 3.204 4.0035/6/2009 3.165 1.399 3.24 3.666 3.624 3.644 3.24 4.0025/7/2009 3.3416 1.418 3.383 3.823 3.776 3.711 3.383 4.1415/8/2009 3.2893 1.451 3.433 3.85 n/a 3.768 3.433 4.158

5/11/2009 3.1764 1.461 3.376 3.795 3.735 3.687 3.376 4.0925/12/2009 3.1727 1.451 3.417 3.857 3.791 3.683 3.417 4.1165/13/2009 3.1213 1.458 3.332 3.775 3.73 3.529 3.332 4.0985/14/2009 3.0921 1.444 3.307 3.81 3.731 3.506 3.307 4.0495/15/2009 3.1378 1.463 3.369 3.841 3.775 3.556 3.369 4.1285/18/2009 3.228 1.415 3.372 3.825 3.782 3.57 3.372 4.1165/19/2009 n/a 1.43 3.432 3.864 3.813 3.49 3.432 4.1645/20/2009 3.199 1.425 3.499 3.854 3.81 3.581 3.499 4.1825/22/2009 3.451 1.435 3.546 3.964 3.909 3.724 3.546 4.2135/25/2009 3.368 1.455 3.604 3.998 3.926 n/a 3.604 4.2745/26/2009 3.551 1.44 3.632 4.03 3.965 3.687 3.632 4.3245/27/2009 3.748 1.47 3.625 4.006 3.965 3.753 3.625 4.335/28/2009 3.612 1.49 3.675 4.069 4.04 3.793 3.675 4.3435/29/2009 3.469 1.495 3.584 3.99 3.938 3.757 3.584 4.2996/1/2009 3.673 1.5 3.669 4.082 4.044 3.833 3.669 n/a6/2/2009 3.622 1.51 3.655 4.046 4.001 3.885 3.655 4.3176/3/2009 3.546 1.55 3.563 3.966 3.912 3.774 3.563 4.2926/4/2009 3.708 1.515 3.63 4.047 3.985 3.841 3.63 4.3176/5/2009 3.834 1.505 3.716 4.12 4.063 3.917 3.716 4.38

185

Page 186: Introduction to Econometrics 0

6/8/2009 3.876 1.515 3.675 4.074 4.018 3.868 3.675 4.323
6/9/2009 3.86 1.53 3.636 4.023 3.965 3.85 3.636 4.304

6/10/2009 3.95 1.545 3.696 4.084 4.036 3.927 3.696 4.339
6/11/2009 3.86 1.555 3.687 4.069 4.023 4 3.687 4.367
6/12/2009 3.794 1.515 3.643 4.035 3.986 3.977 3.643 4.304

You are required to construct a scatterplot of the different international government bond prices that mature in 10 years. Sketch a straight line, or regression line, to fit the data. Is there a relationship between the prices across the different countries?
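As a sketch of how such a fit could be produced programmatically, here is a minimal least-squares helper; the two series below are the first and third value columns of the first six dates above, transcribed purely for illustration, and any pair of countries' columns can be substituted:

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x via the Sxx/Sxy formulas."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum(v * v for v in x) - n * mx * mx
    sxy = sum(u * v for u, v in zip(x, y)) - n * mx * my
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return a, b

# Two illustrative columns from the first six dates of the table:
series_a = [2.79, 2.7717, 2.8451, 2.9507, 2.8452, 2.9006]
series_b = [3.216, 3.155, 3.177, 3.264, 3.161, 3.124]

a, b = fit_line(series_a, series_b)
print(a, b)
# To draw the scatterplot with the fitted line, e.g.:
#   import matplotlib.pyplot as plt
#   plt.scatter(series_a, series_b)
#   plt.plot(series_a, [a + b * v for v in series_a])
#   plt.show()
```

The same helper works for any two columns once the full table has been transcribed into lists.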

Exercise

You are given interest rates from different countries with a maturity of six months. The markets included are the United States, denoted US; Japan, denoted JP; the United Kingdom, denoted GB; the euro area, denoted EU; Norway, denoted NO; and Denmark, denoted DK.

Euro market rates, maturity 6 months
US 6M JP 6M GB 6M EU 6M NO 6M DK 6M

Date Value Value Value Value Value Value
1/2/2009 1.9 1.14 2.25 2.93 3.17 4.5
1/5/2009 1.74 1.52 2.41 2.94 3.22 4.45
1/7/2009 1.54 1.08 2 2.8 3.36 4.4
1/8/2009 1.59 0.91 2.02 2.8 3.55 4.4
1/9/2009 1.48 0.98 2.01 2.61 3.47 4.4

1/12/2009 1.33 1.05 2.13 2.61 3.39 4.4


1/13/2009 1.48 0.98 2.09 2.55 3.19 3.45
1/14/2009 1.5 0.95 2.12 2.6 3.18 4.15
1/15/2009 1.55 1.62 2.86 2.5 3.05 3.8
1/16/2009 1.46 0.81 2.13 2.39 3.23 3.65
1/19/2009 1.46 1.61 2.88 2.44 3.31 3.9
1/20/2009 1.48 0.57 1.85 2.2 3.24 3.65
1/21/2009 1.84 1 1.94 2.24 2.88 3.6
1/22/2009 1.43 0.58 1.66 2.04 3.14 3.65
1/23/2009 1.54 0.93 2.03 2.24 3.2 3.55
1/26/2009 1.58 0.88 1.97 2.22 3.08 3.55
1/27/2009 1.46 0.85 1.97 2.17 3.23 3.55
1/28/2009 1.52 0.75 1.89 2.15 3.22 3.65
1/29/2009 1.55 0.86 1.97 1.94 3.08 3.65
1/30/2009 1.72 0.84 2 2.06 3.08 3.65

2/2/2009 2.01 0.65 2.06 2.15 2.84 3.08
2/3/2009 1.75 0.75 1.81 2.01 2.88 3.55
2/4/2009 1.78 0.8 2.06 2.02 3.22 3.5
2/5/2009 1.85 0.71 2.06 1.94 3.15 3.5
2/6/2009 1.95 0.68 2.13 1.94 3.11 3.55
2/9/2009 1.86 0.87 2.08 1.94 3.07 3.45

2/10/2009 1.77 0.66 1.98 1.95 3.15 3.45
2/11/2009 1.8 0.96 1.9 2.03 2.93 2.75
2/12/2009 1.72 0.66 1.91 1.88 2.87 3.4
2/13/2009 1.71 0.64 1.88 2 3 3.4
2/16/2009 1.89 0.96 1.92 2 2.96 3.3
2/17/2009 1.99 0.65 2.09 1.85 2.97 3.4
2/18/2009 1.64 0.69 2.24 1.87 2.96 3.35
2/19/2009 1.86 0.69 1.96 1.86 3.15 3.35
2/20/2009 1.85 0.69 2.05 1.82 3.08 3
2/23/2009 1.76 0.66 2.03 1.81 3.04 4.38
2/24/2009 1.74 0.63 2.06 1.79 2.92 3.35
2/25/2009 1.88 0.64 1.87 1.78 2.84 3.3
2/26/2009 1.91 0.64 1.94 1.9 2.92 2.87
2/27/2009 1.65 0.58 1.95 1.9 2.77 3.3

3/2/2009 1.94 0.98 1.92 1.88 2.72 3.1
3/3/2009 1.73 0.77 1.9 1.86 2.76 2.6
3/4/2009 1.74 1.03 1.88 1.67 2.8 2.78
3/5/2009 1.76 0.6 1.91 1.77 2.79 3
3/6/2009 1.91 0.67 1.94 1.74 2.77 2.7
3/9/2009 2.08 0.58 1.86 1.71 2.73 2.7

3/10/2009 1.95 0.75 1.69 1.68 2.79 2.7
3/11/2009 1.85 0.74 1.66 1.68 2.9 2.7
3/12/2009 1.95 0.75 1.94 1.66 2.82 2.7
3/13/2009 1.86 0.77 1.66 1.67 2.84 2.7
3/16/2009 1.96 1.04 1.69 1.66 2.78 2.7
3/17/2009 1.76 0.83 1.96 1.61 2.86 2.4
3/18/2009 1.9 0.84 1.56 1.54 2.92 2.4
3/19/2009 1.55 0.79 1.66 1.59 2.96 2.7
3/20/2009 1.59 0.79 1.57 1.58 3.05 2.7
3/23/2009 1.62 0.89 1.58 1.57 2.91 2.4
3/24/2009 1.61 0.78 1.68 1.68 2.86 2.6


3/25/2009 1.49 0.81 1.68 1.67 2.55 2.6
3/26/2009 1.49 0.78 1.59 1.56 2.51 2.6
3/27/2009 1.81 0.84 1.42 1.52 2.52 2.3
3/30/2009 1.63 0.85 1.41 1.56 2.5 2.4
3/31/2009 1.42 0.85 1.61 1.52 2.52 2.4

4/1/2009 1.61 0.86 1.57 1.62 2.54 2.3
4/2/2009 1.63 0.81 1.61 1.62 2.63 2.65
4/3/2009 1.56 0.83 1.47 1.61 2.66 2.25
4/6/2009 1.49 0.82 1.47 1.61 2.69 2.25
4/7/2009 1.7 0.79 1.25 1.59 2.8 2.25
4/8/2009 1.45 0.71 1.24 1.49 2.85 2.25
4/9/2009 1.46 0.67 1.22 1.51 2.86 2.1

4/14/2009 1.47 0.68 1.41 1.6 2.51 2.1
4/15/2009 1.37 0.61 1.55 1.6 2.6 2.36
4/16/2009 1.4 0.7 1.43 1.53 2.62 2.2
4/17/2009 1.4 0.62 1.31 1.53 2.58 2.2
4/20/2009 1.42 0.61 1.4 1.46 2.49 2.23
4/21/2009 1.34 0.54 1.47 1.48 2.5 2.8
4/22/2009 1.4 0.59 1.34 1.53 2.56 2.15
4/23/2009 1.35 0.6 1.29 1.53 2.51 2.4
4/24/2009 1.4 0.62 1.43 1.48 2.64 2.15
4/27/2009 1.33 0.6 1.38 1.44 2.55 2.1
4/28/2009 1.49 0.59 1.48 1.41 2.53 2.1
4/29/2009 1.5 0.63 1.44 1.5 2.6 2.1
4/30/2009 1.24 0.63 1.4 1.39 2.54 1.9

5/4/2009 1.44 0.54 1.41 1.38 2.53 2.1
5/5/2009 1.34 0.71 1.44 1.38 2.39 2.1
5/6/2009 1.3 0.74 1.23 1.39 2.39 2.1
5/7/2009 1.14 0.52 1.2 1.38 2.42 2.3
5/8/2009 1.34 0.44 1.4 1.38 2.4 2.35

5/11/2009 1.28 0.43 1.38 1.36 2.4 2.1
5/12/2009 1.21 0.36 1.36 1.38 2.23 2.1
5/13/2009 1.08 0.41 1.28 1.36 2.29 2.05
5/14/2009 1.01 0.38 1.17 1.35 2.27 1.6
5/15/2009 0.93 0.4 1.13 1.36 2.33 2.2
5/18/2009 1.11 0.39 1.13 1.32 2.18 2.05
5/19/2009 1.11 0.53 1.24 1.32 2.19 1.55
5/20/2009 1.05 0.44 1.25 1.36 2.27 2.05
5/22/2009 1.02 0.37 1.12 1.33 2.19 1.6
5/25/2009 1.05 0.41 1.14 1.35 2.19 2.5
5/26/2009 1.04 0.41 1.28 1.32 2.13 2.05
5/27/2009 1.07 0.43 1.19 1.33 2.08 2.05
5/28/2009 1.02 0.41 1.04 1.52 2.16 1.55
5/29/2009 0.99 0.4 1.11 1.3 2.18 1.95

6/1/2009 0.88 0.37 1.32 1.36 2.19 1.55
6/2/2009 1.18 0.34 1.29 1.32 2.21 1.95
6/3/2009 0.97 0.38 1.2 1.34 2.16 2.1
6/4/2009 1.07 0.36 1.11 1.41 2.18 1.55
6/5/2009 1.06 0.29 1.29 1.35 2.14 2.2
6/8/2009 1.08 0.38 1.32 1.36 1.98 1.95
6/9/2009 1.05 0.41 0.88 1.33 2.14 1.6


6/10/2009 0.98 0.38 1.21 1.3 2.03 1.95
6/11/2009 0.87 0.22 1.22 1.31 2.06 1.95
6/12/2009 1.11 0.2 0.87 1.35 2 1.95

You are required to construct a scatterplot of the different interest rates. Sketch a straight line, or regression line, to fit the data. Is there a relationship between the interest rates of the different countries?

Exercise

You are given the monthly revenues in relation to the expenditures of an investment bank located in the U.K. Calculate the regression equation of revenues in relation to expenditures. Show the standard error figure of the whole regression from Excel. Calculate the mean of the independent variable x and Sxx. You are required to predict the revenues if the expenses are 10000 pounds. Calculate a 90% confidence interval for the predicted revenue.

Revenues (y) Expenditures (x)
7000 3000
8000 4000
8500 5000
10000 6000
12000 7000
13000 10000
14000 11000
18000 8000
19000 12000
20000 10000
21000 18000
22000 19000

Total 172500 113000


The regression equation was calculated in Excel and it is as follows:

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.878429

R Square 0.771637
Adjusted R Square 0.7488

Standard Error 2722.465
Observations 12

ANOVA

df SS MS F Significance F
Regression 1 250444337 250444337 33.78987 0.00016995
Residual 10 74118163.21 7411816.32
Total 11 324562500

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 5546.359 1710.08914 3.2433155 0.00882 1736.04188 9356.675
Expenditures (x) 0.937555 0.1612885 5.8129058 0.00017 0.57818161 1.296928

Solution

The regression equation is y = 5546.359 + 0.937555x

If the expenses are 10000 pounds, the revenues are as follows:

y = 5546.359 + 0.937555 × 10000 = 14921.909 pounds. (1)

The standard error of the whole regression from Excel is 2722.465.

The mean of x is x̄ = Σx / n = 113000 / 12 = 9416.667 pounds.


Sxx = Σxi² − n x̄² = 1349000000 − 12 × 88673617.39 = 1349000000 − 1064083409 = 284916591.3

The 90% confidence interval for the predicted revenue at x = 10000 pounds is calculated as follows:

First of all, we calculate the variance of the prediction error.

σ²[1 + 1/n + (x − x̄)² / Sxx]

σ²[1 + 1/12 + (10000 − 9416.667)² / 284916591.3] = σ²[1 + 0.08333 + 0.001194] = 1.0845 σ² (2)

The mean square of the residual is σ̂² = RSS / degrees of freedom = 74118163.21 / 10 = 7411816.32 (3)

The standard error of the prediction from equations (2) and (3) is as follows:

SEprediction = √(1.0845 × 7411816.32) = 2835.2 (to 1 d.p.)

The 90% confidence interval is based on the t-distribution with n − 2 = 12 − 2 = 10 degrees of freedom; the critical value is 1.812. We use the estimate from equation (1).

14921.909 ±1.812*2835.2

The upper confidence limit is 14921.909 + 1.812 × 2835.2 = 20059.29 pounds.
The lower confidence limit is 14921.909 − 1.812 × 2835.2 = 9784.527 pounds.
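The interval above can be reproduced in a few lines; the constants are taken from the worked example (the regression output, Sxx, and the t critical value 1.812):

```python
import math

n = 12
sigma2_hat = 74118163.21 / 10          # residual mean square, RSS / (n - 2)
x_bar, sxx = 9416.667, 284916591.3
x0 = 10000
y0 = 5546.359 + 0.937555 * x0          # point prediction, equation (1)

# Variance factor of the prediction error: 1 + 1/n + (x0 - x_bar)^2 / Sxx
var_factor = 1 + 1 / n + (x0 - x_bar) ** 2 / sxx
se_pred = math.sqrt(sigma2_hat * var_factor)

t_crit = 1.812                          # t(5%, 10 df) from tables
lower = y0 - t_crit * se_pred
upper = y0 + t_crit * se_pred
print(round(lower, 2), round(upper, 2))
```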


Exercise

You are given the following dataset

Σxi = 200

Σxi² = 900

Σyi = 100

Σyi² = 400

Σxiyi = 550, n = 50

Calculate x̄, ȳ, Sxx, Sxy, Syy, the coefficients α̂ and β̂, and ŷ.

You are given the following formulas:

x̄ = Σxi / n (1)

ȳ = Σyi / n (2)

Sxx = Σxi² − n x̄² (3)

Sxy = Σxiyi − n x̄ ȳ (4)

Syy = Σyi² − n ȳ² (5)

α̂ = ȳ − β̂ x̄ (6)

β̂ = Sxy / Sxx (7)

ŷ = α̂ + β̂ x (8)


Solution

x̄ = Σxi / n = 200 / 50 = 4 (1)

ȳ = Σyi / n = 100 / 50 = 2 (2)

Sxx = Σxi² − n x̄² = 900 − 50 × 4² = 900 − 800 = 100 (3)

Sxy = Σxiyi − n x̄ ȳ = 550 − 50 × 4 × 2 = 550 − 400 = 150 (4)

Syy = Σyi² − n ȳ² = 400 − 50 × 2² = 400 − 200 = 200 (5)

β̂ = Sxy / Sxx = 150 / 100 = 1.5 (6)

α̂ = ȳ − β̂ x̄ = 2 − 1.5 × 4 = −4 (7)

ŷ = α̂ + β̂ x = −4 + 1.5x (8)
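The same solution as a short script, working directly from the given sums:

```python
sum_x, sum_x2 = 200, 900
sum_y, sum_y2 = 100, 400
sum_xy, n = 550, 50

x_bar, y_bar = sum_x / n, sum_y / n
sxx = sum_x2 - n * x_bar ** 2
sxy = sum_xy - n * x_bar * y_bar
syy = sum_y2 - n * y_bar ** 2
beta = sxy / sxx                 # slope
alpha = y_bar - beta * x_bar     # intercept
print(alpha, beta)               # the fitted line is y = alpha + beta * x
```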


Exercise

You are given the following dataset

Σxi = 400

Σxi² = 800

Σyi = 200

Σyi² = 600

Σxiyi = 750, n = 30

Calculate x̄, ȳ, Sxx, Sxy, Syy, the coefficients α̂ and β̂, ŷ, and rxy².

You are given the following formulas:

x̄ = Σxi / n (1)

ȳ = Σyi / n (2)

Sxx = Σxi² − n x̄² (3)

Sxy = Σxiyi − n x̄ ȳ (4)

Syy = Σyi² − n ȳ² (5)

α̂ = ȳ − β̂ x̄ (6)


β̂ = Sxy / Sxx (7)

ŷ = α̂ + β̂ x (8)

rxy² = β̂ Sxy / Syy (9)

Exercise

Please consider the following two variables, x and y, related to sales and advertising, expressed in thousands of USD.

x y
2 1
4 7
7 4
9 2
10 9
3 8
6 4
1 5
5 2

You are required to calculate the variances of α̂ and β̂. Calculate the residual sum of squares. Calculate the standard errors of α̂ and β̂.

The mathematical formulas for Sxx, Sxy, Syy are as follows:

Sxx = Σxi² − n x̄² = 321 − 9 × (5.22)² = 75.7644 (1)

Sxy = Σxiyi − n x̄ ȳ = 229 − 9 × 5.22 × 4.67 = 9.6034 (2)

Syy = Σyi² − n ȳ² = 260 − 9 × (4.67)² = 63.7199 (3)

The mathematical formulas for the variances of α̂ and β̂ are as follows:

V(α̂) = σ²(1/n + x̄²/Sxx) = σ²(1/9 + 5.22²/75.7644) = σ²(0.111 + 0.359646) = 0.470646 σ² (4)

V(β̂) = σ² / Sxx = σ² / 75.7644 = 0.013 σ² (5)

We estimate σ² with the following formula:

σ̂² = (1/(n − 2)) (Syy − Sxy²/Sxx) = (1/7) (63.7199 − 9.6034²/75.7644) = 0.142857 × 62.5026 = 8.929 (6)

From (4) and (6) we have the standard error of α̂: SE(α̂) = √(8.929 × 0.470646) = 2.05.
From (5) and (6) we have the standard error of β̂: SE(β̂) = √(8.929 × 0.013) = 0.34.
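The same standard errors can be cross-checked directly from the nine observations. The script below uses the exact means rather than the rounded 5.22 and 4.67, so the intermediate values differ slightly from the hand calculation, but the rounded standard errors agree:

```python
import math

x = [2, 4, 7, 9, 10, 3, 6, 1, 5]
y = [1, 7, 4, 2, 9, 8, 4, 5, 2]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum(v * v for v in x) - n * mx * mx
sxy = sum(u * v for u, v in zip(x, y)) - n * mx * my
syy = sum(v * v for v in y) - n * my * my

sigma2 = (syy - sxy ** 2 / sxx) / (n - 2)            # residual variance estimate
se_alpha = math.sqrt(sigma2 * (1 / n + mx ** 2 / sxx))
se_beta = math.sqrt(sigma2 / sxx)
print(round(se_alpha, 2), round(se_beta, 2))
```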


Exercise

From the data given from a random sample, the results were as follows:

y x
60 70
55 90
100 120
98 130
114 170
120 198
140 210
150 230
120 240

It is required to draw the scattergram with y, the dependent variable, on the vertical axis and x, the independent variable, on the horizontal axis.

Please comment on the relationship between y and x.

Exercise

You are given data on the consumer price index, (CPI) and the FTSE all share index stock prices for the following 10 years.

Year CPI FTSE all share index
1 63.7 2543.47
2 71.9 3033.48


3 80.2 3142.23
4 84.3 3542.12
5 97.5 3674.32
6 101.2 3712.56
7 105.3 3812.11
8 110.3 3912.10
9 114.5 4000.23
10 123.4 4123.23

a) Please plot the data on a scattergram with the FTSE all share index on the vertical axis and CPI on the horizontal axis.

b) Please comment on the relationship between the two indexes based on economic theory.

c) Consider the following regression model:

yt = α + β1 xt + εt

Where: yt is the dependent variable, represented by the FTSE all share index, and xt is the independent variable, represented by the CPI.

Use the least squares method to estimate the equation and interpret your results.

d) Do the results obtained from the regression equation make economic sense?
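Part (c) can be sketched numerically; a minimal least-squares estimate using the ten observations above (values transcribed from the table, with the same Sxx/Sxy formulas as in the earlier exercises):

```python
cpi = [63.7, 71.9, 80.2, 84.3, 97.5, 101.2, 105.3, 110.3, 114.5, 123.4]
ftse = [2543.47, 3033.48, 3142.23, 3542.12, 3674.32,
        3712.56, 3812.11, 3912.10, 4000.23, 4123.23]

n = len(cpi)
mx, my = sum(cpi) / n, sum(ftse) / n
sxx = sum(v * v for v in cpi) - n * mx * mx
sxy = sum(u * v for u, v in zip(cpi, ftse)) - n * mx * my
b = sxy / sxx        # slope: FTSE index points per unit of CPI
a = my - b * mx      # intercept
print(round(a, 2), round(b, 2))
```

The positive slope is what economic theory would suggest for nominal stock prices against the price level over a long horizon.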

Exercise

You are given data on nominal interest rates and inflation rates for 10 consecutive years.

Year Interest rates Inflation rates
1 8 7
2 7 6
3 6 5


4 5 4
5 4 3
6 3 2
7 2 1
8 1.20 1.23
9 0.50 1.50
10 0.25 1.70

Plot the data with interest rates on the vertical axis as the dependent variable and inflation rates on the horizontal axis as the independent variable. Sketch the scattergram.

Show your calculations related to the least squares regression equation.

Please state the difference between nominal and real interest rates and how they are related to inflation. Please explain the economic reasoning.

Exercise

Please consider the following demand function for gold futures contracts. I have included the coefficient of determination, which measures the goodness of fit of a regression equation, and the standard errors of the intercept and of the independent variable's coefficient. The dependent variable is contract sales and the independent variable is disposable income.

y = 401 + 2.12 xt, r2 = 0.88
SE (1.45) (2.34)

(a) Test the hypothesis that the coefficient β of the independent variable is equal to zero. Is the null hypothesis acceptable?

(b) Construct a 95% confidence interval for the coefficient β.

(c) Calculate the t-value for the coefficient β. Is it statistically significant at the 95% confidence level?
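A sketch of parts (a)-(c) using the reported coefficient and standard error. The sample size is not stated, so the 1.96 critical value below assumes a large sample; with a small sample the exact t critical value for n − 2 degrees of freedom should be used instead:

```python
beta_hat, se_beta = 2.12, 2.34        # reported coefficient and standard error

t_value = beta_hat / se_beta          # t-statistic for H0: beta = 0
lower = beta_hat - 1.96 * se_beta     # 95% confidence interval
upper = beta_hat + 1.96 * se_beta
print(round(t_value, 3), round(lower, 3), round(upper, 3))
# Since |t| < 1.96 and the interval contains zero, the null hypothesis
# beta = 0 cannot be rejected at the 5% level.
```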

Exercise

Please consider the following regression equation related to an investment in a Danske Bank security. The equation shows the relationship of the security to the Copenhagen stock exchange.

rt = α + β rmt + εt


Where: rt is the rate of return of the bank security, rmt is the rate of return on the market portfolio related to the Copenhagen stock exchange, t is the time span, and εt is the error term.

After downloading the relevant data and conducting data analysis regression, we obtained the following output.

rt = 0.875 + 2.956 rmt, r2 = 0.95
SE (0.472) (3.213)
t-statistics (1.85) (0.92)

Please calculate the following:

(a) How do you interpret the coefficient of determination? Is there a strong positive relationship between the variables?

(b) Is the coefficient β of the market return significant?

(c) Is the security aggressive or defensive? A beta greater than 1 represents a volatile security.

Exercise

Please consider the following regression equation relating the gross national product, GNP, to personal consumption expenditures and net exports of goods and services, denominated in billions of dollars. The following data and results were obtained from the US Federal Reserve for the years 1980–2012.

Date Gross domestic product Personal consumption expenditures Net exports of goods and services
1980 2862.5 1754.6 -13.1
1981 3210.9 1937.5 -12.5
1982 3345 2073.9 -20
1983 3638.1 2286.5 -51.7
1984 4040.7 2498.2 -102.7
1985 4346.7 2722.7 -114
1986 4590.1 2898.4 -131.9
1987 4870.2 3092.1 -144.8
1988 5252.6 3346.9 -109.4
1989 5657.7 3592.8 -86.8
1990 5979.6 3825.6 -77.9
1991 6174 3960.2 -28.6
1992 6539.3 4215.7 -34.8
1993 6878.7 4471 -65.2
1994 7308.7 4741 -92.5
1995 7664 4984.2 -89.8
1996 8100.2 5268.1 -96.4
1997 8608.5 5560.7 -102
1998 9089.1 5903 -162.7
1999 9665.7 6316.9 -261.4


2000 10289.7 6801.6 -380.1
2001 10625.3 7106.9 -369
2002 10980.2 7385.3 -425
2003 11512.2 7764.4 -500.9
2004 12277 8257.8 -614.8
2005 13095.4 8790.3 -715.7
2006 13857.9 9297.5 -762.4
2007 14480.3 9744.4 -709.8
2008 14720.3 10005.5 -713.2
2009 14417.9 9842.9 -392.2
2010 14958.3 10201.9 -518.5
2011 15533.8 10711.8 -568.7
2012 16244.6 11149.6 -547.2

The regression equation is as follows:

GNP = 501.64 + 1.4257 x1 + 0.023 x2, r2 = 0.999
SE (48.0267) (0.0137) (0.1611)
t-statistics (10.445) (103.845) (0.144)

where x1 is personal consumption expenditures and x2 is net exports of goods and services.

Please calculate the following:

(a) Interpret the r2.

(b) Interpret the t-statistics and state whether they are significant at the 5% significance level.

(c) Please forecast the GNP figure if the personal consumption figure is 13211.4 and the net exports of goods and services figure is -521.1.
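Part (c) can be sketched by substituting the given figures into the estimated equation, writing x1 for personal consumption expenditures and x2 for net exports of goods and services:

```python
x1 = 13211.4     # personal consumption expenditures
x2 = -521.1      # net exports of goods and services

gnp_forecast = 501.64 + 1.4257 * x1 + 0.023 * x2
print(round(gnp_forecast, 2))
```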


Exercise

Please consider the following data related to different categories of hedge funds. The categories are conservative, diversified, equity and event driven. I have also included data for the hedge funds index return. Run each regression separately. The dependent variables are the conservative return fund, the diversified return fund, the equity return fund and the event driven return fund. The independent variable is the hedge funds index return.

Conservative return | Diversified return | Equity return | Event driven return | Hedge funds index return

1.57% -1.15% -0.96% 0.15% 1.49%
3.16% 0.23% 0.87% 0.12% 0.53%
3.90% -1.32% 3.63% 0.11% 0.36%

-0.47% -3.24% -0.08% 0.10% 0.38%
-0.61% -1.67% 0.15% 0.18% -0.09%
-1.10% -0.76% -1.80% -0.08% 0.18%
-1.21% 4.60% 0.22% 0.41% 1.68%
-7.34% -3.83% -3.22% -1.30% 0.36%
0.70% -0.21% -2.69% -4.44% -0.37%
0.06% -1.92% -3.06% -3.33% -0.82%
1.06% -0.13% 1.88% 0.28% 0.44%

-0.43% 0.06% 3.21% 0.40% 2.27%
-0.62% 0.10% 0.53% 1.23% 0.91%
0.45% -1.06% 0.31% 0.55% 0.70%
1.78% -1.33% 1.20% 0.73% -0.03%
3.23% -0.17% 4.27% 2.14% -0.18%
3.16% -0.39% 0.66% 1.82% -0.40%
0.27% -1.10% 3.13% 1.60% -0.36%
3.05% -1.05% 0.51% 1.10% 0.57%
0.94% -0.59% 0.30% 0.14% 0.47%
2.33% -2.11% 0.72% 0.30% 0.15%
0.91% 1.23% -0.45% 0.35% 0.07%
4.60% 1.11% 1.84% 1.19% 0.99%
5.12% -1.04% 4.59% 1.52% 0.68%
0.79% 1.27% 0.19% -1.03% 0.80%
8.32% 1.05% 2.75% 1.91% -0.16%

-0.68% -1.64% -0.42% 1.71% 1.01%
-3.15% -0.34% -0.23% 0.42% 1.23%
-3.69% -0.62% 0.82% 1.20% 0.44%


3.44% -5.85% 1.47% 1.23% 0.18%
0.04% 0.81% 0.49% 0.54% 0.32%
3.70% -0.78% 2.07% 1.16% -0.27%
3.58% -0.32% -0.26% 0.56% 0.70%

-3.47% -0.11% 0.04% 0.76% 0.82%
-3.09% -0.34% 1.16% 0.96% 0.82%
-0.80% -2.24% 2.45% 0.41% 1.33%
1.82% -1.52% 1.38% 0.86% 0.12%

-0.30% -0.90% 0.54% 1.25% 0.91%
-5.16% -0.52% 1.70% 0.99% -0.15%
0.72% 0.44% -0.35% 0.50% -0.16%
1.29% 0.44% 0.20% 1.15% -0.14%

-0.02% 0.03% 0.16% 0.39% -0.83%
-0.27% -0.03% 0.25% 0.40% 0.56%
0.04% -1.26% 1.04% 0.74% 0.99%

-2.41% -1.46% 0.05% -0.18% 1.07%
0.61% 0.88% 1.62% 1.01% 1.00%
1.73% -4.68% -0.76% 0.72% 0.67%
0.77% 1.67% 0.71% 0.24% 0.74%
2.37% 1.67% 0.85% 0.90% 0.18%
0.10% -1.24% -0.10% 0.51% 0.12%
1.30% -1.69% 0.65% 0.76% 0.55%
0.43% -1.18% 0.48% 0.66% 1.74%
0.86% -2.33% 0.91% 0.67% 0.74%

-0.52% -1.92% 0.46% 0.03% 0.73%
-3.39% -1.47% 0.29% -1.29% -0.36%
0.87% -0.92% 0.96% -0.23% 0.76%

-0.92% 0.16% 0.43% -0.04% 0.17%
0.01% 0.22% -0.31% -2.27% 0.06%
3.95% 0.14% 0.59% 1.16% 0.89%
0.47% 0.56% 1.06% 0.72% 0.40%

-0.38% -0.70% 2.06% 1.10% 0.76%
0.46% 0.31% 2.13% 1.12% 0.24%

-0.97% -0.46% 0.95% 0.77% 0.11%
0.07% -1.50% -0.97% 0.37% 0.23%
2.35% -6.79% 1.19% 0.76% 0.53%
1.18% 0.20% 2.60% 2.08% -0.75%
0.17% 0.20% 0.27% 1.23% 1.12%
0.51% 0.11% -0.40% 0.34% 0.37%
0.17% 0.12% 1.70% 0.00% 0.48%
0.85% 0.73% 0.80% 0.93% 3.38%
0.65% 0.66% 1.50% 0.83% 0.33%
0.79% 0.69% 0.40% 0.56% 0.29%

It is required to calculate the following:

(a) Plot in a scattergram the returns of the different categories of hedge funds. Is there any correlation between them?

(b) Plot in a scattergram the returns of each hedge fund in relation to the hedge funds index.


(c) Construct a 95% and 99% confidence interval for the slope of each fund and test the hypothesis that the true slope coefficient is zero, namely that there is no relationship between them.

(d) Construct in Excel a table with summary statistics for each fund. Comment on the relationship between the mean and the standard deviation.

Exercise

You are given the following dataset. Please calculate the residual sum of squares, RSS.

Months y x

1 1.5 1

2 2 2.5

3 1 0

4 2 3

5 3.5 4

6 1.5 2

Solution

The first step is to run the regression in Excel. Please do not forget to check the box for the residual output. Then do some calculations in the residual output table to obtain the same figure, 0.7, as in the summary output. You should square the residuals and then add them up.

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.900686

R Square 0.811236
Adjusted R Square 0.764045

Standard Error 0.41833

Observations 6

ANOVA

df SS MS F Significance F
Regression 1 3.008333 3.008333 17.190476 0.014305018
Residual 4 0.7 0.175


Total 5 3.708333

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 0.785714 0.321825 2.441432 0.0710986 -0.107817433 1.679246
x 0.542857 0.130931 4.14614 0.014305 0.179334394 0.90638

RESIDUAL OUTPUT

Observation Predicted y Residuals Residuals squared
1 1.32857143 0.17142857 0.029387755
2 2.14285714 -0.14285714 0.020408163
3 0.78571429 0.21428571 0.045918367
4 2.41428571 -0.41428571 0.171632653
5 2.95714286 0.54285714 0.294693878
6 1.87142857 -0.37142857 0.137959184
Total 0.7
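The 0.7 figure can also be cross-checked outside Excel by fitting the same six points by least squares and summing the squared residuals:

```python
x = [1, 2.5, 0, 3, 4, 2]
y = [1.5, 2, 1, 2, 3.5, 1.5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum(v * v for v in x) - n * mx * mx
sxy = sum(u * v for u, v in zip(x, y)) - n * mx * my
b = sxy / sxx        # slope
a = my - b * mx      # intercept
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
print(round(a, 6), round(b, 6), round(rss, 6))
```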

Exercise

You are given the following data related to short-run costs and output. The cost figures are denominated in pounds.

Output Total fixed cost Total variable cost Total cost
10 150 0 150
20 150 50 200
30 150 100 250
40 150 150 300
50 150 200 350
60 150 250 400
70 150 300 450


80 150 350 500
90 150 400 550

100 150 450 600
110 150 500 650
120 150 550 700
130 150 600 750
140 150 650 800
150 150 700 850
160 150 750 900
170 150 800 950
180 150 850 1000
190 150 900 1050
200 150 950 1100

It is required to calculate the following:

(a) Plot in a scattergraph the total fixed cost in relation to the variable cost.
(b) Construct a 95% and 99% confidence interval for the total variable cost.
(c) Construct in Excel a table with summary statistics for each cost.
(d) Run the regression equation. The dependent variable is total cost and the independent variable is variable cost.

Exercise

You are given the total fixed cost, (TFC), the total variable cost, (TVC), and the total cost, (TC) in relation to the quantity from the previous exercise.

Output Total fixed cost Total variable cost Total cost
10 150 0 150
20 150 50 200
30 150 100 250
40 150 150 300
50 150 200 350
60 150 250 400
70 150 300 450
80 150 350 500
90 150 400 550

100 150 450 600
110 150 500 650
120 150 550 700
130 150 600 750
140 150 650 800
150 150 700 850
160 150 750 900
170 150 800 950
180 150 850 1000


190 150 900 1050
200 150 950 1100

Calculate the average fixed cost, (AFC), the average variable cost, (AVC), the average cost, (AC), and the marginal cost, (MC). The mathematical formulas are as follows:

AFC = TFC / Q (1)

AVC = TVC / Q (2)

AC = TC / Q (3)

MC = ΔTC / ΔQ (4)

Construct a new table and calculate the above costs by applying the mathematical formulas. Plot a scattergraph to see the relationship between the average fixed and average variable cost. Please comment on the relationship between the marginal and average cost. Can you see that, as output gets bigger, the average fixed cost falls?
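As a sketch of the requested table, the four cost measures can be computed as follows (only the first five output levels are transcribed here; the pattern continues in steps of 10 units of output and 50 pounds of variable cost):

```python
outputs = [10, 20, 30, 40, 50]
tfc = 150
tvc = [0, 50, 100, 150, 200]
tc = [tfc + v for v in tvc]

afc = [tfc / q for q in outputs]                       # AFC = TFC / Q
avc = [v / q for v, q in zip(tvc, outputs)]            # AVC = TVC / Q
ac = [c / q for c, q in zip(tc, outputs)]              # AC = TC / Q
mc = [(tc[i] - tc[i - 1]) / (outputs[i] - outputs[i - 1])
      for i in range(1, len(tc))]                      # MC = dTC / dQ
print(afc, mc)
# AFC falls monotonically as output rises, while MC is constant at 5
# because total cost rises by 50 pounds for every 10 extra units.
```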


Exercise

You are given the marginal revenue, the average revenue, and the total revenue based on the following dataset.

Quantity (units) Average revenue (pounds) Total revenue (pounds)
0 10 0
100 10 500
200 10 1000
300 10 1500
400 10 2000
500 10 2500
600 10 3000
700 10 3500
800 10 4000
900 10 4500
1000 10 5000
1100 10 5500
1200 10 6000
1300 10 6500
1400 10 7000
1500 10 7500
1600 10 8000
1700 10 8500
1800 10 9000

Plot a scattergraph to see the relationship between quantity and total revenue. Please comment on the relationship between the quantity and total revenue.

Run a linear regression equation. Use total revenue as the dependent variable and quantity as the independent variable. Comment on the R2 and the F-statistic. When is the average revenue equal to the price and equal to the marginal revenue?

Hint: Does the equality hold when the price is constant or when it changes? Check the economics notes of the seminar group.
