    A Multivariate Model of Total Expenditure among Households in the Philippines

    Alyssa Ann Angeles

    2011-43922

    John Vincent R. Pardilla

    2011-03625

    Marcus J. Valdez

    2011-40663

    In partial fulfillment of the requirements for

    Econ 131 Econometrics

    1st Semester, AY 2013-2014

    School of Economics

    University of the Philippines

    Diliman, Quezon City

    October 14, 2013

    I. Introduction

    Do total income and family size have a positive impact on a household's total expenditure? This is the question this paper seeks to answer.

    The Philippine population has grown rapidly in recent years. From 88.55M in August 2007, it increased to 92.34M in May 2010. [1] According to the National Statistics Office, the Philippine population grew at an average annual rate of 1.90% during the period 2000-2010. This means that, each year, about two persons are added for every 100 persons in the population. [2] Since every addition to the population must belong to some household or family, this growth suggests that family sizes have been increasing as well. Meanwhile, the average annual income of families in both the bottom 30% and the top 70% of the population increased from 2006 to 2009, by 13 thousand pesos and 42 thousand pesos respectively. The average annual income of poor families, those in the bottom 30%, rose from 49 thousand pesos to 62 thousand pesos, while that of non-poor families, those in the top 70%, rose from 226 thousand pesos to 268 thousand pesos. [3] Lastly, total expenditure also increased from 2000 to 2009, as shown in the 2009 FIES. [4]

    [1] National Statistics Office. Philippines in Figures. Retrieved from http://www.census.gov.ph/

    [2] National Statistics Office. (2012). Population grew by 1.90 percent annually. In The 2010 Census of Population and Housing Reveals the Philippine Population at 92.34 Million. Retrieved from http://www.census.gov.ph/content/2010-census-population-and-housing-reveals-philippine-population-9234-million

    [3] National Statistics Office. (2011, Feb 4). Families in the Bottom 30% Income Group Earned 62 Thousand Pesos in 2009 (Final Results from the 2009 Family Income and Expenditure Survey). Retrieved from http://www.census.gov.ph/content/families-bottom-30-percent-income-group-earned-62-thousand-pesos-2009-final-results-2009

    [4] National Statistics Office. (2009). Percent Distribution of Annual Family Expenditures by Expenditure Group, Philippines: 2000, 2003, 2006, 2009 [Data file]. Retrieved from http://www.bles.dole.gov.ph/PUBLICATIONS/Yearbook%20of%20Labor%20Statistics/STATISTICAL%20TABLES/PDF/CHAPTER%2012/Table%2012-3.pdf

    At certain points between 2003 and 2009, all three variables (family size, total income, and total expenditure) can be observed to have increased. However, intuition and the evidence above are not enough to conclude that there is a relationship, particularly a positive one, among the three variables. Using data from the 2009 Family Income and Expenditure Survey, this study aims to determine whether a relationship exists among a household's total expenditure, total income, and family size, and which of the two independent variables, total income or family size, has the greater impact on total expenditure. This paper hypothesizes that total expenditure is positively related to both total income and family size; that is, an increase in either or both would lead to an increase in total expenditure. The specific components of total expenditure are not explored or discussed in this paper.

    II. Methodology

    This study uses data from the 2009 Family Income and Expenditure Survey (FIES), which the National Statistics Office (NSO) conducts every three years. The survey has a total of 38,400 observations, which could be considered large enough to represent the whole population. Only cross-sectional data are used in this study; no time-series data are used. The paper is therefore limited to establishing relationships within this data set.

    Data

    Variable            Observations   Mean       Std. Dev.   Min    Max
    Family Size         38400          47.45872   21.56525    10     200
    Total Income        38400          195811.5   .4976421    0      3.04e+07
    Total Expenditure   38400          165984.9   164981.7    9250   4108871

    Empirical Specification

    Analytical Model

    In this study, total expenditure is modeled as a function of total income and the number of members in the family:

    Total Expenditure = totex(toinc, fsize)

    Total income was selected as part of the model because, theoretically, total income affects total expenditure: the income a family earns largely determines how much it spends. Family size was selected because additional family members increase expenditure, at least in the short run. In the long run, new family members become assets since they themselves earn income. The analysis in this paper is static, covering only the year 2009, which sets aside the long-run case and the complications it brings.
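The functional form above is general. The regressions reported in this paper are consistent with a linear specification, which can be sketched as follows. The symbols \beta_0, \beta_1, \beta_2, and u_i are notational assumptions introduced here for illustration; the sign restrictions restate the paper's hypotheses, including the Keynesian condition that the marginal propensity to consume lies between 0 and 1.

```latex
\[
\mathit{totex}_i = \beta_0 + \beta_1\,\mathit{toinc}_i + \beta_2\,\mathit{fsize}_i + u_i,
\qquad 0 < \beta_1 < 1,\quad \beta_2 > 0
\]
```

Here i indexes households, \beta_1 is the change in total expenditure per additional peso of income with family size held fixed, and \beta_2 is the change per additional unit of family size with income held fixed.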

    Variable: Family Size
    Expected Effect: +
    Remarks: It is hypothesized that as family size increases, there are more expenses (such as food, education, etc.) in the family, thus increasing total expenditure.

    Variable: Total Income
    Expected Effect: + s.t. 0
    Remarks: As Keynes stated, men [women] are disposed, as a rule and on average, to increase their consumption as their income increases, but not as much as the increase in their income. Keynes postulated that the Marginal Propensity to Consume is greater than zero but less than 1.

    III. Regressions and Estimations

    Preliminary Regression

    OLS regression is first used to estimate the model. Total expenditure is regressed on total income and family size. The result of the preliminary regression using Stata is as follows:

    . reg totex toinc fsize

          Source |       SS       df       MS              Number of obs =   38400
    -------------+------------------------------           F(  2, 38397) =23585.31
           Model |  5.7617e+14     2  2.8809e+14           Prob > F      =  0.0000
        Residual |  4.6901e+14 38397  1.2215e+10           R-squared     =  0.5513
    -------------+------------------------------           Adj R-squared =  0.5512
           Total |  1.0452e+15 38399  2.7219e+10           Root MSE      =  1.1e+05

    ------------------------------------------------------------------------------
           totex |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           toinc |   .4145162   .0019521   212.34   0.000       .41069    .4183423
           fsize |   660.4736   26.27325    25.14   0.000     608.9773    711.9698
           _cons |   53472.66   1388.513    38.51   0.000     50751.14    56194.18
    ------------------------------------------------------------------------------

    In this regression, totex, toinc, and fsize are the variables for total expenditure, total income, and family size respectively. The preliminary regression gives an R² of 0.5513, which means that about 55% of the variation in total expenditure is explained by the two regressors, total income and family size. Although this R² is relatively low, such values are typical of cross-section data with a large number of observations; therefore, it cannot be concluded that the model is a poor fit.

    Turning to the coefficients, each is individually highly significant, with p-values reported as 0.000. The F value is also very high, which suggests that the regressors are jointly significant as well. The coefficient of total income is positive, which indicates that total expenditure increases as total income increases. Holding family size constant, if the family's total income increases by one peso, total expenditure increases by about 0.41 pesos, or 41 centavos. This can be interpreted as the marginal propensity to consume: if the family's total income increases by 100 pesos, for example, the family is expected to spend about 41% of the increase. As emphasized earlier, the components of a family's expenditure are not explored or discussed in this paper.

    The coefficient of family size is also positive, at about 660, which means that, holding total income constant, total expenditure increases by about 660 pesos as family size increases by 1. This 660-peso increase cannot be directly compared with the 0.41-peso increase for every peso of additional income, because the two regressors are measured in different units. At this point we cannot state with certainty that expenditure responds more to an increase in income than to an increase in family size. However, standardizing the variables in an auxiliary regression (done in a later part of this paper) allows the influence of each independent variable on total expenditure to be compared.

    The constant term, for the most part, has no meaningful economic interpretation. Taken literally, it says that if income is 0 and family size is 0, expenditure would be 53,473 pesos. It cannot be interpreted that way, since a family of size zero should have no expenditure at all.
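As a worked illustration of the interpretation above, the following Python sketch plugs the reported coefficients into the fitted equation. The function name `predicted_totex` is hypothetical, and the evaluation point (the sample means from the Data table) is an assumption chosen only for illustration.

```python
# Coefficients copied from the preliminary regression output.
B_CONS = 53472.66    # _cons
B_TOINC = 0.4145162  # toinc
B_FSIZE = 660.4736   # fsize


def predicted_totex(toinc, fsize):
    """Fitted total expenditure for a given total income and family size.

    Hypothetical helper for illustration: it only evaluates the fitted line.
    """
    return B_CONS + B_TOINC * toinc + B_FSIZE * fsize


# Sample means of toinc and fsize, taken from the Data table.
MEAN_TOINC = 195811.5
MEAN_FSIZE = 47.45872

at_means = predicted_totex(MEAN_TOINC, MEAN_FSIZE)
# Change in fitted expenditure when income rises by 100 pesos, fsize fixed.
income_effect = predicted_totex(MEAN_TOINC + 100, MEAN_FSIZE) - at_means
# Change in fitted expenditure when fsize rises by one unit, income fixed.
fsize_effect = predicted_totex(MEAN_TOINC, MEAN_FSIZE + 1) - at_means

print(f"fitted totex at the sample means: {at_means:,.1f}")
print(f"effect of 100 more pesos of income: {income_effect:.2f}")
print(f"effect of one more unit of fsize: {fsize_effect:.2f}")
```

Because an OLS line with an intercept passes through the sample means, the first figure should be close to the mean of total expenditure in the Data table (165984.9).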

    Estimation Issues

    Despite having 38,400 observations, which some may consider large enough to represent the whole population, the model used in this paper is not necessarily free of estimation issues. It is therefore customary and appropriate to test for violations of the classical assumptions, namely heteroscedasticity, autocorrelation, and multicollinearity, and to remedy any that are found.

    1. Heteroscedasticity

    From the preliminary regression, a test was done to check for the presence of

    heteroscedasticity in the data used. The result is as follows:

    . hettest

    Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
             Ho: Constant variance
             Variables: fitted values of totex

             chi2(1)      =  1.80e+07
             Prob > chi2  =   0.0000
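For reference, the textbook form of the Breusch-Pagan statistic with the fitted values as the only explanatory variable can be sketched as follows. This is the standard score-test formulation under normal errors and is assumed, not confirmed by the source, to correspond to the test reported above; the symbols \delta_0, \delta_1, v_i, and ESS are notation introduced here.

```latex
\[
\frac{\hat{u}_i^{2}}{\hat{\sigma}^{2}} = \delta_0 + \delta_1\,\widehat{\mathit{totex}}_i + v_i,
\qquad \hat{\sigma}^{2} = \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i^{2}
\]
\[
\mathrm{LM} = \tfrac{1}{2}\,\mathrm{ESS} \;\sim\; \chi^{2}(1)
\quad \text{asymptotically, under } H_0:\ \delta_1 = 0
\]
```

Here \hat{u}_i are the OLS residuals and ESS is the explained sum of squares of the auxiliary regression. A large statistic means the squared residuals vary systematically with the fitted values.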

    The test results point to the presence of heteroscedasticity. The p-value (Prob > chi2) is reported as 0.0000, so we reject the null hypothesis of constant error variance. The model must therefore be re-estimated with White's heteroscedasticity-consistent variances and standard errors, also known as robust standard errors, so that inference does not suffer from the consequences of heteroscedasticity. Robust standard errors leave the OLS coefficient estimates unchanged; they correct the estimated standard errors, and hence the t and F statistics, for the non-constant variance, which in this data may be driven in part by outliers. Below is the result of the robust regression:

    Linear regression                                      Number of obs =   38400
                                                           F(  2, 38397) = 1601.60
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.5513
                                                           Root MSE      =  1.1e+05

    ------------------------------------------------------------------------------
                 |               Robust
           totex |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           toinc |   .4145162   .1060899     3.91   0.000     .2065773     .622455
           fsize |   660.4736   148.4128     4.45   0.000     369.5807    951.3665
           _cons |   53472.66   13523.12     3.95   0.000     26966.99    79978.33
    ------------------------------------------------------------------------------

    The coefficient estimates and the R² are unchanged, and the coefficients are still individually significant. As expected, however, the estimated standard errors change: White's heteroscedasticity-corrected standard errors are considerably larger than the OLS standard errors, so the estimated t values are much smaller than those obtained by OLS.
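The effect of the correction can be checked directly from the two reported tables, since each t value is the coefficient divided by its standard error. The Python sketch below uses only the reported figures; the dictionary and variable names are hypothetical.

```python
# (coefficient, OLS standard error, robust standard error) as reported
# in the preliminary and robust regression tables.
REPORTED = {
    "toinc": (0.4145162, 0.0019521, 0.1060899),
    "fsize": (660.4736, 26.27325, 148.4128),
    "_cons": (53472.66, 1388.513, 13523.12),
}

t_ols = {}
t_robust = {}
se_ratio = {}
for name, (coef, se_ols, se_rob) in REPORTED.items():
    t_ols[name] = coef / se_ols        # t value with OLS standard error
    t_robust[name] = coef / se_rob     # t value with robust standard error
    se_ratio[name] = se_rob / se_ols   # how much the standard error grew
    print(f"{name:>6}: t_ols={t_ols[name]:8.2f}  "
          f"t_robust={t_robust[name]:5.2f}  "
          f"se_ratio={se_ratio[name]:6.2f}")
```

The coefficients are identical in both tables, so the drop in each t value comes entirely from the larger robust standard error.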

    2. Autocorrelation

    As stated in the first part of this paper, this study uses no time-series data, where autocorrelation is most likely to occur. Autocorrelation is therefore assumed to be unlikely in this paper's model.

    3. Multicollinearity

    To check for multicollinearity, the variance inflation factor (VIF) of each regressor was computed. A VIF of more than 10 is taken to call for further investigation. The tolerance, which is 1/VIF, is another measure of multicollinearity; a tolerance of less than 0.1 is likewise taken as a sign of multicollinearity. The result is as follows:

        Variable |       VIF       1/VIF
    -------------+----------------------
           fsize |      1.01    0.990891
           toinc |      1.01    0.990891
    -------------+----------------------
        Mean VIF |      1.01

    The result above shows no sign of multicollinearity in the data used for this study. The VIFs of fsize and toinc are well below 10, while the tolerance, shown in the third column, is well above 0.1. Therefore, multicollinearity is not a concern for the regression estimates.
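With only two regressors, each VIF equals 1 / (1 - r²), where r is the sample correlation between the two regressors, so the reported tolerance also implies how weakly total income and family size are correlated. The Python sketch below backs this out from the reported tolerance; the variable names are hypothetical, and the sign of r cannot be recovered from the VIF alone.

```python
import math

# Tolerance (1/VIF) as reported in the VIF table for both regressors.
TOLERANCE = 0.990891

vif = 1.0 / TOLERANCE            # variance inflation factor
r_squared = 1.0 - TOLERANCE      # R-squared of one regressor on the other
abs_corr = math.sqrt(r_squared)  # absolute correlation between regressors

print(f"VIF = {vif:.4f}")
print(f"implied |corr(toinc, fsize)| = {abs_corr:.4f}")
```

An implied correlation this small is consistent with both VIFs being reported as 1.01.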

    Standardizing Variables

    As mentioned earlier, the coefficients cannot be compared in their current form since family size and total income are measured on different scales. To allow a comparison of the influence of total income and family size on total expenditure, standardized versions of the variables are generated by subtracting each variable's mean and dividing by its standard deviation. These are summarized in the table below:

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
          ztotex |     38400    5.59e-06    1.000005  -.9500124   23.89904
          ztoinc |     38400    1.72e-06    1.000002  -.6470511   103.9544
          zfsize |     38400    1.83e-07    .9999999  -1.736995   7.073476

    As shown by the table above, the variables are indeed standardized: their standard deviations are very close to 1 and their means are almost zero. The means are printed in scientific notation, so 5.59e-06, for example, stands for 0.00000559. Regressing the standardized variables gives the following result:

          Source |       SS       df       MS              Number of obs =   38400
    -------------+------------------------------           F(  2, 38397) =23585.31
           Model |   21168.298     2   10584.149           Prob > F      =  0.0000
        Residual |  17231.0478 38397  .448760263           R-squared     =  0.5513
    -------------+------------------------------           Adj R-squared =  0.5512
           Total |  38399.3458 38399    1.000009           Root MSE      =   .6699

    ------------------------------------------------------------------------------
          ztotex |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          ztoinc |   .7292455   .0034343   212.34   0.000     .7225143    .7359768
          zfsize |   .0863328   .0034343    25.14   0.000     .0796016    .0930641
           _cons |   4.32e-06   .0034185     0.00   0.999    -.0066961    .0067048
    ------------------------------------------------------------------------------

    After the regression, the variables remain significant, which allows the influence of total income and family size on total expenditure to be measured and compared. The variables ztotex, ztoinc, and zfsize are the standardized versions of total expenditure, total income, and family size respectively, as denoted by the letter z preceding each name. Because the variables are standardized, the coefficients are read in standard-deviation units: holding family size constant, an increase of one standard deviation in total income increases total expenditure by about 0.73 standard deviations. In the same manner, total expenditure increases by about 0.09 standard deviations when family size increases by one standard deviation. With this interpretation, it can be said that an increase in total income results in a greater increase in total expenditure than an increase in family size does. Concluding from the preliminary regression, before the variables were standardized, that family size has the greater impact because its coefficient of 660 exceeds the 0.41 of total income would be erroneous.
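The link between the two sets of coefficients can be made explicit: a standardized coefficient equals the unstandardized one multiplied by the standard deviation of the regressor and divided by the standard deviation of the dependent variable. The Python sketch below checks this for family size using the figures reported in this paper; the helper names and the toy data are hypothetical and serve only to illustrate standardization.

```python
import statistics


def zscores(values):
    """Standardize values: subtract the mean, divide by the sample std. dev."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def standardized_coef(b, sd_x, sd_y):
    """Convert an unstandardized slope b into standard-deviation units."""
    return b * sd_x / sd_y


# Toy data (made up) showing that z-scores have mean 0 and std. dev. 1.
toy = [2.0, 4.0, 4.0, 6.0, 9.0]
z = zscores(toy)
print(f"toy z-scores: mean={statistics.mean(z):.6f}, sd={statistics.stdev(z):.6f}")

# Figures reported in this paper for family size and total expenditure.
B_FSIZE = 660.4736    # fsize coefficient, preliminary regression
SD_FSIZE = 21.56525   # std. dev. of family size, Data table
SD_TOTEX = 164981.7   # std. dev. of total expenditure, Data table

beta_fsize = standardized_coef(B_FSIZE, SD_FSIZE, SD_TOTEX)
print(f"implied standardized fsize coefficient: {beta_fsize:.4f}")
```

The implied value should agree, up to rounding, with the zfsize coefficient in the standardized regression above.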

    Residuals

    It is also necessary to examine the normality of the error terms obtained from the robust regression. A histogram and a normal probability plot of the residuals were examined: the histogram resembles a bell-shaped distribution, while in the normal probability plot the residuals seem to follow the straight line. This suggests that the error term is approximately normally distributed. This matters because, if the error terms are not normally distributed, the usual t and F tests cannot be relied on in small samples. However, that is not a concern in this model. As noted, the OLS estimators are asymptotically normally distributed provided that the error term has finite variance, is homoscedastic, and has a mean of zero. As a result, the t and F tests may be valid as long as the sample is reasonably large, as it is in this study with 38,400 respondents.

    IV. Findings

    In the initial regression, a one-unit increase in family size increases expenditure by about 660 pesos, while in the standardized version, a one-standard-deviation increase in family size increases expenditure by about 0.09 standard deviations. One reason for this positive effect is that an additional family member brings a new set of expenditures. A new family member has to eat, so expenditure on food has to increase. A new member also needs proper education. The family's expenditure then increases by the amount spent on these needs. The same is true for the other components of a household's total expenditure besides food and education.

    The variables chosen are significant in the model used in this study. Their p-values are very low, which implies that the null hypothesis that total income and family size have no effect on total expenditure can be rejected in favor of the alternative hypothesis that these variables do influence total expenditure. Furthermore, total income is found to have a greater influence on total expenditure than family size does. A household's total expenditure, on average, increases more with an increase in the household's total income than with an increase in its family size.

    V. Recommendation

    While this study has established a clear relationship among a household's total expenditure, total income, and family size, it has not shown any kind of relationship between a family's total income and its size. This paper, in general, showed and discussed the significance of each variable and its isolated effect on total expenditure. It could be improved by looking for any relationship between the independent variables and by studying their interaction. In doing so, the researcher can determine whether the variables have a joint effect on total expenditure beyond their separate additive effects. It would be interesting to know whether the interaction of the family size and income variables has any such additional effect.
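One way such an extension could be specified, offered only as a sketch and not estimated in this paper, is to add the product of the two regressors to the linear model. The coefficient symbols are notation introduced here for illustration.

```latex
\[
\mathit{totex}_i = \beta_0 + \beta_1\,\mathit{toinc}_i + \beta_2\,\mathit{fsize}_i
  + \beta_3\,(\mathit{toinc}_i \times \mathit{fsize}_i) + u_i
\]
\[
\frac{\partial\,\mathit{totex}_i}{\partial\,\mathit{fsize}_i} = \beta_2 + \beta_3\,\mathit{toinc}_i
\]
```

Under this specification the effect of family size on expenditure depends on the level of income, and a test of \beta_3 = 0 would indicate whether the purely additive model used here is adequate.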

    Any researcher interested in exploring this topic further may enhance the results of this study by taking into consideration the effect of various policies, particularly the Reproductive Health Bill, which may have an impact on family size and may therefore influence the behavior of total expenditure.

    It would also be helpful to add more variables to the model used in this study and to establish the relationships between these variables and total expenditure.