Financial Valuation and Econometrics


    Kian-Guan Lim 

Financial Valuation

and Econometrics

    Final Draft version under Copy-Editing by

    World Scientific Publishing, June 2010


    About the Author

Kian-Guan Lim received his doctorate from Stanford University in 1986 and works in the field of risk management and financial asset pricing. He is Professor of Quantitative Finance in the Business School at the Singapore Management University and adjunct Professor in the Mathematics Department of the National University of Singapore. Prior to joining SMU, Kian-Guan was at NUS and founded the University Center for Financial Engineering, starting the Master of Science program in Financial Engineering. He has consulted for several banks on risk validation and valuation. He was also a reservist captain in the Singapore Armed Forces and has held administrative positions at SMU and NUS, including deanships and headships.


    To Leng


    Contents

Preface

Chapter 1   Probability Distribution and Statistics
Chapter 2   Statistical Laws and Central Limit Theorem. Application: Stock Return Distributions
Chapter 3   Two-Variable Linear Regression. Application: Financial Hedging
Chapter 4   Model Estimation. Application: Capital Asset Pricing Model
Chapter 5   Constrained Regression. Application: Cost of Capital
Chapter 6   Time Series Analysis. Application: Inflation Forecasting
Chapter 7   Random Walk. Application: Market Efficiency
Chapter 8   Autoregression and Persistence. Application: Predictability
Chapter 9   Estimation Errors and T-Tests. Application: Event Studies
Chapter 10  Multiple Linear Regression and Stochastic Regressors
Chapter 11  Dummy Variables and ANOVA. Application: Time Effect Anomalies
Chapter 12  Specification Errors
Chapter 13  Cross-Sectional Regression. Application: Testing CAPM
Chapter 14  More Multiple Linear Regressions. Application: Multi-Factor Asset Pricing


Chapter 15  Errors-in-Variables. Application: Exchange Rates and Risk Premium
Chapter 16  Unit Root Processes. Application: Purchasing Power Parity
Chapter 17  Conditional Heteroskedasticity. Application: Risk Estimation
Chapter 18  Mean Reverting Continuous Time Process. Application: Bonds and Term Structures
Chapter 19  Implied Parameters. Application: Option Pricing
Chapter 20  Generalized Method of Moments. Application: Consumption-Based Asset Pricing
Appendix A  Matrix Algebra
Appendix B  EViews Guide
Appendix C  Linear Regression in EXCEL
Appendix D  Multiple Choice Question Tests
Appendix E  Solutions to Problem Sets
Index


    Preface

This book is an introduction to financial valuation and financial data analyses using econometric methods. The complexity and enormity of global financial markets today have far-reaching implications for financial decision-making and practice. The uncertainty that drives and perturbs financial market prices often draws rigorous scientific search in the hope of finding clues to reproduce winning formulas for gold, or else to find the secret recipe to avert disasters. This scientific drive is partly fueled by the rising of mathematical analyses, including economic and statistical theory, to the occasion, and is also spurred on by the increasing availability of financial market data as well as the fullness of computing power in crunching data.

Financial valuation is the key to investment decision-making and risk management, which are central to any economy today. In a nutshell, investment leads to production of tomorrow's consumption goods. Risk management leads to prudence in investment and savings so as to avoid bankruptcies that cost aplenty. Since the 1950s, the field of finance has developed a rich set of theories and a rigorous framework with which to understand how stock prices are formed, whether rationally or sometimes perhaps behaviorally or with anomalies. Besides stock prices and returns, bond prices, interest rates, exchange rates, futures and option prices are other major financial variables in

the capital markets.

The empirical validation of financial valuation models and of market phenomena by data, and in turn the feedback to appropriate and effective theoretical modeling, form an interesting and exciting experience in the study of finance. There are really three key domain knowledge areas here. Financial valuation or pricing theories are typically constructed from more fundamental economic axioms such as investor rationality and insatiability. Some mathematical tools such as optimization and conditional expectations are utilized. In the process of deriving a closed-form or else analytical-form theoretical model, market equilibrium conditions are often added as part of the necessary conditions to a solution. A major output of such theorizing efforts is an asset pricing model. Theoretical models help to explain positively how market variables happened and provide a vehicle to develop optimal decision-making.

Yet pragmatic investment decisions typically require parameter inputs that have to be estimated, or require forecasts of future prices. Such considerations inevitably lead to applications of statistical models to historical prices and time series of the relevant economic variables. In more formal language, this is the construction of a probability space on these variables of interest. If we model a particular variable over time, it is a statistical model. If we model a collection of variables over time, where these variables mutually influence


each other, then it is also called an econometric model. Thus the second key domain knowledge area is probability and statistical theory, or else econometrics in the context of problems to do with capital market finance.

Finally, another key domain is how data are collected and used. Raw data are just numbers that in themselves do not lend much insight. For example, if we collect twelve past monthly return rates of a particular stock, and find that the sample average of these is one percent, this average of one percent should not be used simply as an expectation of what the next month's return would be. But suppose we have in addition a statistical model showing that the monthly return rate of this stock follows an upward trend of half a percent while any deviation is due to random error; then it is more accurate to expect next month's return to be half a percent. Sometimes great attention has to be paid to whether the monthly return rate, the daily return rate, or the intra-day return rate is more appropriate for the question under study. It is of paramount importance to understand how the data are obtained, whether there are observational or recording errors, and whether there are better proxy variables to represent the effect we seek.

This book is a modest attempt to bring together these domains in financial valuation theory, in econometrics modeling, and in the empirical analyses of financial data. These domains are highly inter-twined and should be properly understood in order to correctly and effectively harness the power of data and statistical or econometrics methods for investment and financial decision-making.

One can think of many good books in basic econometrics and also many good books in finance theory and modeling. The contribution of this book, and at the same time its novelty, is in employing materials in basic econometrics, particularly linear regression analyses, and weaving into it threads of foundational finance theory, concepts, ideas, and models. The treatment in this book is at a basic level. It is hoped that advanced undergraduate or first-year postgraduate students learning finance and/or basic econometrics or linear regression analyses could go through the materials in this book with a heightened appreciation of how applied econometrics is inter-twined with the discovery of financial market knowledge.

It is also hoped that all students who work through this book will begin to understand that it may be indeed erroneous to make a forecast by simply taking a bunch of financial time series data and making a straight-line least squares regression. We should seek to know what the theory, if any, behind the linkage of the variables is, be able to choose the appropriate time series data, and employ useful econometric modeling to address not-so-apparent features of the time series such as non-homogeneity, non-linearity, measurement errors, and so on. Students should also appreciate that the estimates or test statistics are in themselves random variables and would


behave in some prescribed manner as sample size varies, and would also misbehave if there is a spurious problem, and thus be able to interpret empirical results with more scientific precision.

The chapters of the book are organized along a general progression of topics taught in basic econometrics, particularly in linear regression analyses, although there is also coverage of time series analyses and of the nonlinear generalized method-of-moments technique. There is a clear attempt on my part to make this coincide with the teaching of key concepts and theories in financial valuation. In fact, this feature of covering both finance and econometrics at the same time should be especially rewarding and interesting to students who are learning both finance and basic econometrics at the same time.

At the beginning of each chapter in this book, key points of learning are listed so that students can check their own progress and whether they have covered the major materials of the chapter. Some econometrics materials, especially those involving multiple variables, are more conveniently developed in terms of matrices instead of arithmetic algebra. Therefore, some prior knowledge of matrix algebra will be helpful. Appendix A contains a short refresher on matrix algebra as a preparation.

Most chapters contain one or more finance application examples where finance concepts, and sometimes theory, are taught. I have tried to incorporate real examples of companies and practice where useful, and due to my nationality, I naturally use some examples of Singapore-based companies. References to articles and sources are usually listed as footnotes on the same pages. Data sources for the empirical examples are cited. The empirical examples were developed using EVIEWS, a statistical software package that is easily available. Alternative software such as R or SAS, or even EXCEL (with VBA), can be used as well. A beginner's guide to using EVIEWS and also EXCEL regression is provided in Appendix B and C respectively. Each chapter ends with a problem set for the student to practice, and more reading references should the student desire to learn more advanced materials related to the contents of that chapter. Appendix D contains sets of multiple choice question tests so students can quickly check if they understand the concepts taught. Appendix E provides solutions to the problem sets.

This manuscript is a substantial expansion and revision of a draft version that I used to teach a course on Investment and Financial Data Analysis. I wish to express my thanks to Dharma, Hong Chao, Jane Lim, Yi Bao, Christopher Ting, and several other colleagues who provided valuable feedback. Finally, any updates or errata will be available at http://www.mysmu.edu/faculty/kglim.

     Kian Guan

    Singapore, April 2010 


    Chapter 1

    PROBABILITY DISTRIBUTION AND STATISTICS

    Key Points of Learning

Random variable, Joint probability distribution, Marginal probability distribution, Conditional probability distribution, Expected value, Variance, Covariance, Correlation, Independence, Normal distribution function, Chi-square distribution, Student-t distribution, F-distribution, Data types and categories, Sampling distribution, Hypothesis, Statistical test

    1.1 PROBABILITY

Joint probability, marginal probability, and conditional probability are important basic tools in financial valuation and regression analyses. These concepts and their usefulness in financial data analyses will become clearer at the end of the chapter. To motivate the idea of a joint probability distribution, let us begin by looking at a time series plot or graph of two financial economic variables over time, Xt and Yt: for example, the S&P 500 Index aggregate price-to-earnings ratio Xt, and the S&P 500 Index return rate Yt. The values or numbers that variables Xt and Yt will take are uncertain before they happen, i.e. before time t. At time t, both economic variables take realized values or numbers xt and yt. xt and yt are said to be realized jointly or simultaneously at the same time t. Thus we can describe their values as a joint pair (xt, yt). If their order is preserved, it is called an ordered pair. Note that the subscript t represents the time index.

The P/E or price-to-earnings ratio of a stock or a portfolio is a financial ratio showing the price paid for the stock relative to the annual net income or profit per share earned by the firm for the year. The reciprocal of the P/E ratio is called the earnings yield. The earnings yield or E/P reflects the risky annual rate of return, R, on the stock. This is easily seen from the relationship $E = $P × R; in other words, P/E = 1/R.

In Figure 1.1, it seems that low return corresponded to, or lagged, high P/E, especially at the beginnings of the years 1929-1930, 1999-2002, and 2008-2009. Conversely, high returns followed relatively low P/E ratios at the beginning of the years 1949-1954, 1975-1982, and 2006-2007. We shall explore the issue of the predictability of stock return in a later chapter.

The idea that random variables correspond with each other over time or display some form of association is called a statistical correlation, which is


defined, or has interpretative meaning, only when there exists a joint probability distribution describing the random variables.

Figure 1.1
S&P 500 Index Portfolio Return Rate and Price-Earnings Ratio, 1872-2009 (Data from Prof. Shiller, Yale University)
[Figure: time series plot of the S&P 500 Index return rate and the S&P 500 Index aggregate P/E ratio against year, 1870-2010; the return-rate axis runs from -60% to 60%.]

In Figure 1.2, we plot the U.S. national aggregate consumption versus national disposable income in US$ billion. Disposable income is defined as Personal Income less personal taxes. Personal Income is National Income less corporate taxes and corporate retained earnings. In turn, National Income is Gross Domestic Product (GDP) less depreciation and indirect business taxes such as sales tax. GDP is essentially the total dollar output or gross income of the country. If we include repatriations from citizens working abroad, then it becomes Gross National Product (GNP).

In Figure 1.2, it appears that consumption increases in disposable income. The relationship is approximately linear. This is intuitive, as on a per capita basis we would expect that for each person, when his or her disposable income rises, he or she would consume more. In life-cycle models of financial economics theory, some types of individual preferences could lead to consumption as an increasing function of individual wealth, which consists of inheritance as well as fresh income. Sometimes analysis on income also


breaks it down into a permanent part and a transitory part. More of these could be read in economics articles on life-cycle models and hypotheses.

Figure 1.2
U.S. Annual National Aggregate Consumption versus Disposable Income, 1999-2009 (Data from the Federal Reserve Board of the U.S., in $ billion)
[Figure: scatter plot of consumption (vertical axis, roughly $7,200 to $9,600 billion) against disposable income (horizontal axis, roughly $7,000 to $11,000 billion).]

In Figure 1.3 we evaluate the annual year-to-year change in consumption and disposable income and plot them on an X-Y graph. The point P1 refers to the bivariate values (x1, y1), where x1 is the change in disposable income and y1 is the change in consumption in 2000. P2 refers to the bivariate values (x2, y2), where x2 is the change in disposable income and y2 is the change in consumption in 2001, and so on. Subscripts to x and y indicate time. It may be construed as the end of a time period and the beginning of the next time period. In this case, subscript 1 refers to time t1, the end of year 2000.

The pattern in Figure 1.3 reveals that the disposable income change dropped from t=1 to t=2, then rose back at t=3. After that there was a sharp drop at t=4 before a wild swing back up at t=5, and so on. The changes seem to be cyclical. A cyclical but decreasing trend can be seen in consumption. However, what is more interesting is that consumption and disposable income visibly increased and decreased together. Thus, if we construe consumption as


purchases of goods and services, then the plot displays the positive income effect on such effective demand. Theoretically, each Xt and each Yt for every time t is a random variable.

Figure 1.3
U.S. Annual Year-to-Year Change in National Aggregate Consumption versus Change in Disposable Income, 2000-2009 (Data from the Federal Reserve Board of the U.S., in $ billion)
[Figure: scatter plot of the change in consumption (vertical axis, roughly -$100 to $400 billion) against the change in disposable income (horizontal axis, roughly $40 to $360 billion), with the ten annual points labelled P1 through P10.]

A random variable is a variable that takes on different values, each with a given probability. It is a variable with an associated probability distribution. For the above scatter plot, since Xt and Yt occur jointly together in (Xt, Yt), the pair is a bivariate random variable, and thus has a joint bivariate probability distribution. There are two generic classes of probability distributions: the discrete probability distribution, where the random variable takes on only a finite set of possible values, and the continuous probability distribution, where the random variable takes on an uncountable number of possible values. In what follows, we construct a bivariate discrete probability distribution of the return rates on two stocks.

Let t denote the day number. Thus, time t=1 is the end of day 1, t=2 is the end of day 2, and so on. Let Pt be the price in $ of stock ABC at time t. Let Xt+1 be


stock ABC's holding or discrete return rate at time t+1: Xt+1 = Pt+1/Pt - 1. The corresponding continuously compounded return rate at t+1 is ln(Pt+1/Pt), which is approximately Xt+1 when Xt+1 is close to 0. Another stock XYZ has discrete return rate Yt+1 at time t+1.

Table 1.1
Discrete Bivariate Joint Probability of Two Stock Return Rates

                                     X_{t+1}
P(x_{t+1}, y_{t+1})    a1      a2      a3      a4      a5      a6     | P(y_{t+1})
          b1          0.005   0.03    0.03    0.015   0.005   0.01   |  0.095
          b2          0.015   0.02    0.04    0.015   0.005   0.02   |  0.115
Y_{t+1}   b3          0.015   0.025   0.05    0.02    0.015   0.05   |  0.175
          b4          0.03    0.03    0.07    0.08    0.025   0.035  |  0.27
          b5          0.02    0.06    0.04    0.05    0.045   0.02   |  0.235
          b6          0.015   0.035   0.02    0.02    0.005   0.015  |  0.11
P(x_{t+1})            0.1     0.2     0.25    0.2     0.1     0.15   |  1

In Table 1.1, we must take care to distinguish between the random variable X_{t+1} and the realized value it takes in an outcome, e.g. x_{t+1} = a_3. For example, a_3 could be 0.08 or 8%. In the bivariate discrete probability distribution shown in the table, X_{t+1} takes one of six possible values, viz. a_1, a_2, a_3, a_4, a_5, and a_6. The probability of any one of these six events or outcomes is given by P(X_{t+1} = x_{t+1} = a_k), or in short P(x_{t+1}), and is shown in the last row of the table. The probability function P(.) for a discrete probability distribution is also called a probability mass function (pmf). We should think of a probability or chance as a function that maps or assigns a number in [0,1] ⊂ ℝ to each realized value of the random variable, where ℝ denotes the real line or (-∞, +∞). Likewise, the probability of any one of the six outcomes of random variable Y_{t+1} is given by P(y_{t+1}) and is shown in the last column of the table. Note that the probabilities of events that make up all the possibilities must sum to 1.

The joint probability of the event or outcome with realized values (x_{t+1}, y_{t+1}) is given by P(X_{t+1}=x_{t+1}, Y_{t+1}=y_{t+1}). These probabilities are shown in the cells within the inner box. For example, P(a_3, b_5) = 0.04. This means that the probability or chance of X_{t+1} = a_3 and Y_{t+1} = b_5 simultaneously occurring is 0.04 or 4%. Clearly the sum of all the joint probabilities within the inner box must equal 1. The marginal probability of Y_{t+1} = b_3 in the context of the (bivariate) joint probability distribution is the probability that Y_{t+1} takes the realized value y_{t+1} = b_3 regardless of the simultaneous value of x_{t+1}. We write


this marginal probability as P_Y(Y_{t+1}=b_3). The subscript Y to the probability function P(.) is to highlight that it is the marginal probability of Y. Sometimes this is omitted. Note that this marginal probability is also a univariate probability. In this case, P_Y(b_3) = P(a_1,b_3) + P(a_2,b_3) + P(a_3,b_3) + P(a_4,b_3) + P(a_5,b_3) + P(a_6,b_3). Notice we simplify the notation, indicating the a_j's and b_k's as values of x_{t+1} and y_{t+1} respectively where the context is understood. In full summation notation,

$$P_Y(Y_{t+1}=b_3) = \sum_{j=1}^{6} P(X_{t+1}=a_j,\, Y_{t+1}=b_3).$$

This is obviously the sum of the numbers in the row involving b_3, and is equal to 0.175. The marginal probability of X_{t+1} = a_2 is given by

$$P_X(X_{t+1}=a_2) = \sum_{k=1}^{6} P(X_{t+1}=a_2,\, Y_{t+1}=b_k) = 0.2.$$

Thus, given the joint probability distribution, the marginal probability distribution of any one of the joint random variables can be found.

What is $\sum_{j=1}^{6}\sum_{k=1}^{6} P(X_{t+1}=a_j,\, Y_{t+1}=b_k)$? Employing the concept of marginal probability we just learned,

$$\sum_{j=1}^{6}\sum_{k=1}^{6} P(X_{t+1}=a_j,\, Y_{t+1}=b_k) = \sum_{j=1}^{6} P_X(X_{t+1}=a_j) = 1.$$
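As a quick numerical illustration of these marginal sums, the short sketch below encodes the joint probabilities of Table 1.1 and recovers the marginals. Python and numpy are used here purely for illustration; they are not the book's EViews workflow.

```python
import numpy as np

# Joint probabilities P(X=a_j, Y=b_k) from Table 1.1.
# Rows are b1..b6 (values of Y_{t+1}); columns are a1..a6 (values of X_{t+1}).
P = np.array([
    [0.005, 0.030, 0.030, 0.015, 0.005, 0.010],   # b1
    [0.015, 0.020, 0.040, 0.015, 0.005, 0.020],   # b2
    [0.015, 0.025, 0.050, 0.020, 0.015, 0.050],   # b3
    [0.030, 0.030, 0.070, 0.080, 0.025, 0.035],   # b4
    [0.020, 0.060, 0.040, 0.050, 0.045, 0.020],   # b5
    [0.015, 0.035, 0.020, 0.020, 0.005, 0.015],   # b6
])

P_Y = P.sum(axis=1)   # marginal P(y): sum each row over the a_j's
P_X = P.sum(axis=0)   # marginal P(x): sum each column over the b_k's

print(P_Y[2].round(3))   # P_Y(b3) = 0.175
print(P_X[1].round(3))   # P_X(a2) = 0.2
print(P.sum().round(3))  # all joint probabilities sum to 1
```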

In the bivariate probability case, we know that future risk or uncertainty is characterized by one and only one of the 36 pairs of values (a_j, b_k) that will occur. Suppose the event has occurred, and we know only that it is event {X_{t+1}=a_2} that occurred, but without knowing which of the events b_1, b_2, b_3, b_4, b_5, or b_6 had occurred in simultaneity. An interesting question is to ask what is the probability that {Y_{t+1}=b_3} had occurred, given that we know {X_{t+1}=a_2} occurred. This is called a conditional probability, and is denoted by P(Y_{t+1}=b_3 | X_{t+1}=a_2). The symbol "|" represents "given" or "conditional on".

From Table 1.1, we focus on the column where it is given that {x_{t+1} = a_2} occurred. This is shown below as Table 1.2. The highlighted 0.025 is the joint probability of (a_2, b_3). Intuitively, the higher (lower) this number, the higher (lower) is the conditional probability that b_3 in fact had occurred simultaneously. Given that a_2 had occurred, we are finding the conditional probabilities given {x_{t+1} = a_2}, which form in themselves a proper probability distribution and thus must add to 1. The conditional probability must therefore be the relative size of 0.025 to the other joint probabilities in that column.


Table 1.2
Joint Probability of Two Stock Return Rates when X_{t+1} = a_2

b1   0.03
b2   0.02
b3   0.025
b4   0.03
b5   0.06
b6   0.035

We recall Bayes' rule on event sets, that

$$P(A\,|\,B) = \frac{P(A, B)}{P(B)}$$

where A and B are events or event sets in a universe. We can think of the outcome {X_{t+1} = a_2} as event B, and the outcome {Y_{t+1} = b_3} as event A. Events can be more general, as the occurrences {X_{t+1}=a_j}, {Y_{t+1}=b_k}, {X_{t+1}=a_j, Y_{t+1}=b_k} are all events or event sets. More exactly,

$$P(b_3\,|\,a_2) = \frac{P(a_2, b_3)}{P_X(a_2)} = \frac{0.025}{0.2} = 0.125.$$

In general,

$$P(Y_{t+1}=b_k\,|\,X_{t+1}=a_j) = \frac{P(X_{t+1}=a_j,\, Y_{t+1}=b_k)}{P_X(X_{t+1}=a_j)} = \frac{P(X_{t+1}=a_j,\, Y_{t+1}=b_k)}{\sum_{k=1}^{6} P(X_{t+1}=a_j,\, Y_{t+1}=b_k)}.$$
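Continuing the same illustrative sketch (reusing the array P of Table 1.1 joint probabilities defined above), the conditional probability P(Y_{t+1}=b_3 | X_{t+1}=a_2) is just the joint probability divided by the relevant marginal:

```python
# Conditional probability from the joint table P defined in the previous sketch:
# P(Y=b3 | X=a2) = P(a2, b3) / P_X(a2)
p_cond = P[2, 1] / P[:, 1].sum()
print(p_cond)                        # 0.025 / 0.2 = 0.125

# The full conditional distribution of Y given X=a2 sums to 1:
cond_dist = P[:, 1] / P[:, 1].sum()
print(cond_dist, cond_dist.sum())
```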

When we move from a discrete probability distribution, where event sets consist of discrete elements, to a continuous probability distribution, where event sets are continuous, such as intervals on a real line, we have to deal with continuous functions.

The continuous joint probability density function (pdf) of the bivariate (X_{t+1}, Y_{t+1}) is represented by a continuous function f(x,y) where X_{t+1} = x, Y_{t+1} = y, and x, y are usually numbers on the real line ℝ. Note that we simplify the notation of the realized values by dropping their time subscripts here. For a continuous probability distribution, the events are described not as point values, e.g. x=3, y=4, but rather as intervals, e.g. event A = {(x,y): -2 < y < 3}, B = {(x,y): 0 < x < 9.5}. Then,


$$P(A, B) = P(0 < x < 9.5,\; -2 < y < 3) = \int_{0}^{9.5}\!\!\int_{-2}^{3} f(x,y)\, dy\, dx.$$

The "support" for a random variable such as X_{t+1} is the range of x. For joint normal densities, the ranges are usually (-∞, ∞). Thus, Y_{t+1} also has the same support. It is usually harmless to use (-∞, ∞) as the support even if the range is a finite [a,b], since the probabilities of the null events (-∞, a) and (b, ∞) are zero. However, when more advanced mathematics is involved, it is typically better to be precise. Notice also that probability is essentially an integral of a function, whether continuous or discrete, and is area under the pdf curve.

The marginal probability density functions of X_{t+1} and Y_{t+1} are given by

$$f_Y(y) = \int f(x,y)\, dx \qquad\text{and}\qquad f_X(x) = \int f(x,y)\, dy.$$

Notice that while f(x,y) is a function containing both x and y, f_Y(y) is a function containing only y, since x is integrated out. Likewise f_X(x) is a function that contains only x.

The conditional probability density functions are:

$$f(x\,|\,y) = f(x,y)/f_Y(y) \qquad\text{and}\qquad f(y\,|\,x) = f(x,y)/f_X(x).$$

These conditional pdf's contain both x and y in their arguments.

    1.2 EXPECTATIONS

The expected value of random variable X_{t+1} is given by

$$E(X_{t+1}) = \sum_{j=1}^{6} a_j\, P_X(a_j) = \mu_X$$

for the discrete distribution in Table 1.1, and for a continuous pdf,

$$E(X_{t+1}) = \int x\, f_X(x)\, dx = \mu_X.$$

The conditional expected value or conditional expectation of X_{t+1} given b_4 is given by

$$E(X_{t+1}\,|\,b_4) = \sum_{j=1}^{6} a_j\, P(a_j\,|\,b_4)$$

for the discrete distribution in Table 1.1,


and for a continuous pdf,

$$E(X_{t+1}\,|\,y) = \int x\, f(x\,|\,y)\, dx.$$

Notice that for the continuous pdf, the conditional expected value given y is a function containing only y. This means that one can further evaluate more specific conditional expectations based on given sets of y values, e.g. {y: -2 < y < 3}. Then E(X_{t+1} | -2 < y < 3) is found via

$$E(X_{t+1}\,|\,-2<y<3) = \int x\, f(x\,|\,-2<y<3)\, dx = \int x\, \frac{\int_{-2}^{3} f(x,y)\, dy}{\int\!\int_{-2}^{3} f(x,y)\, dy\, dx}\, dx = \frac{\int_{-2}^{3}\!\int x\, f(x,y)\, dx\, dy}{\int_{-2}^{3} f_Y(y)\, dy}.$$

The interchange of integrals in the last step above uses the Fubini Theorem, assuming some mild regularity conditions are satisfied by the functions.

The variance of a continuous random variable X_{t+1} is given by

$$\mathrm{var}(X_{t+1}) = \sigma_X^2 = \int (x-\mu_X)^2\, f_X(x)\, dx.$$

Variance measures the degree of movement or variability of the random variable itself. The standard deviation of a random variable X_{t+1} is the square root of the variance. Standard deviation (s.d.) is sometimes referred to as volatility, and sometimes as "risk" in the finance literature.

The covariance between two continuous random variables X_{t+1} and Y_{t+1} is given by

$$\mathrm{cov}(X_{t+1}, Y_{t+1}) = \sigma_{XY} = \int\!\!\int (x-\mu_X)(y-\mu_Y)\, f(x,y)\, dy\, dx.$$

Covariance measures the degree of co-movement between two random variables. If the two random variables tend to move together, i.e. when one increases (decreases), the probability of the other increasing (decreasing) is


high, then the covariance will be a positive number. If they vary inversely, then the covariance will be a negative number. If there is no co-moving relationship and each random variable moves independently, then their covariance is zero. Notice that covariance is also an expectation or integral.

The co-movement of two random variables is typically better characterized by their correlation coefficient, which is the covariance normalized, or divided, by their s.d.'s:

$$\mathrm{corr}(X_{t+1}, Y_{t+1}) = \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X\, \sigma_Y}.$$

One other advantage of using the correlation coefficient rather than the covariance is that the correlation coefficient is not denominated in the value units of X or Y but is a ratio.

It is important to understand that correlation measures association but not causality. In Figure 1.3, clearly changes in consumption and income are strongly positively correlated. Suppose one concludes that increasing consumption will increase income; the resulting action would be disastrous. Or even if one simply concludes (based on some understanding of macroeconomic theory or by intuition) that increased income causes increased consumption, it may still be premature, as there are so many other possibilities and qualifications. For example, some other unobserved variable such as general education level could lead to increases in both income and consumption.

Or, suppose we think of Y_{t+1} as GDP and X_{t+1} as population. Both increase with time due to various economic and geo-political reasons. But it would be disastrous for policy implications to think that increasing population leads to or causes an increase in GDP. Such a conclusion has to assume fairly constant employment and output.

For general random variables X and Y (dropping time subscripts), we can write their means, variances, and covariance as follows:

$$E(X) = \mu_X, \qquad E(Y) = \mu_Y$$
$$\mathrm{var}(X) = E(X-\mu_X)^2 = E(X^2) - \mu_X^2$$
$$\mathrm{var}(Y) = E(Y-\mu_Y)^2 = E(Y^2) - \mu_Y^2$$
$$\mathrm{cov}(X,Y) = E\big[(X-\mu_X)(Y-\mu_Y)\big] = E(XY) - \mu_X\,\mu_Y.$$
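As a concrete illustration of these definitions, the sketch below evaluates the means, variances, covariance, and correlation for the discrete joint distribution of Table 1.1. The numerical outcomes assigned to a_1,…,a_6 and b_1,…,b_6 are hypothetical values invented for this example; the text does not specify them.

```python
import numpy as np

# Joint probabilities from Table 1.1 (rows b1..b6, columns a1..a6).
P = np.array([
    [0.005, 0.030, 0.030, 0.015, 0.005, 0.010],
    [0.015, 0.020, 0.040, 0.015, 0.005, 0.020],
    [0.015, 0.025, 0.050, 0.020, 0.015, 0.050],
    [0.030, 0.030, 0.070, 0.080, 0.025, 0.035],
    [0.020, 0.060, 0.040, 0.050, 0.045, 0.020],
    [0.015, 0.035, 0.020, 0.020, 0.005, 0.015],
])

# Hypothetical return-rate outcomes (not from the book) for a1..a6 and b1..b6.
a = np.array([-0.04, -0.02, 0.00, 0.02, 0.04, 0.08])   # values of X
b = np.array([-0.06, -0.03, 0.00, 0.01, 0.03, 0.05])   # values of Y

P_X = P.sum(axis=0)                  # marginal of X (sum over rows)
P_Y = P.sum(axis=1)                  # marginal of Y (sum over columns)

mu_X = (a * P_X).sum()               # E(X) = sum_j a_j P_X(a_j)
mu_Y = (b * P_Y).sum()               # E(Y)
var_X = ((a - mu_X) ** 2 * P_X).sum()
var_Y = ((b - mu_Y) ** 2 * P_Y).sum()

# cov(X,Y) = sum_j sum_k (a_j - mu_X)(b_k - mu_Y) P(a_j, b_k)
cov_XY = ((b - mu_Y)[:, None] * (a - mu_X)[None, :] * P).sum()
corr_XY = cov_XY / np.sqrt(var_X * var_Y)

print(mu_X, mu_Y, var_X, var_Y, cov_XY, corr_XY)
```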

Covariances are actually linear operators. A function is $f: A \to B$, or $f: a \mapsto b = f(a)$ with $a \in A$, $b \in B$, in which A is the domain set, B is the range set, and each a is mapped onto one and only one element b in B.


We can think of an operator as a special case of a function where the domain and range consist of normed spaces such as a vector space. These technicalities are not important except in more advanced courses.

Now consider N random variables X_i, where i = 1, 2, …, N. A very useful property of covariance is shown below:

$$\mathrm{cov}\Big(\sum_{i=1}^{N} X_i,\ \sum_{j=1}^{N} X_j\Big) = E\Big[\Big(\sum_{i=1}^{N} X_i - E\sum_{i=1}^{N} X_i\Big)\Big(\sum_{j=1}^{N} X_j - E\sum_{j=1}^{N} X_j\Big)\Big]$$
$$= E\Big[\sum_{i=1}^{N}\big(X_i - EX_i\big)\sum_{j=1}^{N}\big(X_j - EX_j\big)\Big] = \sum_{i=1}^{N}\sum_{j=1}^{N} E\big[(X_i - EX_i)(X_j - EX_j)\big] = \sum_{i=1}^{N}\sum_{j=1}^{N}\mathrm{cov}(X_i, X_j).$$

A special case of the above is

$$\mathrm{var}(X+Y) = \mathrm{cov}(X+Y,\ X+Y) = \mathrm{cov}(X,X) + \mathrm{cov}(X,Y) + \mathrm{cov}(Y,X) + \mathrm{cov}(Y,Y) = \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X,Y).$$

     

    A convenient property of a correlation coefficient     is that it lies between – 1

    and +1. This is shown as follows. For any real ,

      0YζXζρ2θ2Yζ2θ2XζθYXvar    .

    PutY

    X

    ζ

    ζρθ  . Then, 02

    Xζ2ρ22

    Xζ2ρ2

    Xζ   .

    Thus, for any random variable X and Y,   0ρ1ζ   22X   , and hence   01   2    , or 12    . Therefore, 11       .

1.3 DISTRIBUTIONS

Continuous probability distributions are commonly employed in regression analyses. The most common probability distribution is the normal (Gaussian) distribution. The pdf of a normally distributed random variable X is given by


$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \qquad \text{for } -\infty < x < \infty,$$

where the mean of X is μ and the s.d. of X is σ; μ and σ are given constants.

$$E(X) = \int x\, f(x)\, dx = \mu$$

$$\mathrm{Var}(X) = E(X-\mu)^2 = \int (x-\mu)^2\, f(x)\, dx = \sigma^2$$

The cumulative distribution function (cdf) of X is

$$F(x) = \int_{-\infty}^{x} f(u)\, du.$$

We can write the distribution of X as $X \overset{d}{\sim} N(\mu, \sigma^2)$, in which the arguments indicate the mean and variance of the normal random variable. Suppose we define a corresponding random variable

$$Z \triangleq \frac{X-\mu}{\sigma} \qquad\text{or}\qquad X = \mu + \sigma Z,$$

where the symbol "≜" means "to define". The second equality is interpreted as not just equivalence in distribution, but that whenever Z takes value z, then X takes value x = μ + σz. Then,

E(Z) = 0, and Var(Z) = 1.

Since a constant multiple of a normal random variable is normally distributed, and a sum of normal random variables is also a normal random variable, $Z \overset{d}{\sim} N(0,1)$. Z has pdf φ(z), and is called the standard normal variable.

For the normal distribution N(μ, σ²),

$$F(x) = \int_{-\infty}^{\frac{x-\mu}{\sigma}} \phi(z)\, dz$$


where φ(·) is the standard normal pdf and z = (x-μ)/σ. The standard normal cdf is often written as Φ(z). For the standard normal Z, P(a ≤ Z ≤ b) = Φ(b) - Φ(a).

The normal distribution is a familiar workhorse in statistical estimation and testing. The normal distribution pdf curve is "bell-shaped". Areas under the curve are associated with probabilities. The following Figure 1.4 shows a standard normal pdf N(0,1) and the associated probabilities.

Figure 1.4
Standard Normal Probability Density Function of Z
[Figure: the bell-shaped standard normal pdf; the total area from -∞ to ∞ is 1, and the area to the left of a = -1.645 is 5%.]

The corresponding z values of the random variable (r.v.) Z can be seen in the following standard normal distribution Table 1.3.

For example, the probability P(-∞ < Z < 1.5) = 0.933. This same probability can be written as P(-∞ ≤ Z < 1.5) = 0.933, P(-∞ < Z ≤ 1.5) = 0.933, or P(-∞ ≤ Z ≤ 1.5) = 0.933. This is because for a continuous pdf, P(Z = 1.5) = 0.

From the symmetry of the normal pdf, P(-a < Z < ∞) = P(-∞ < Z < a), so we can also compute the following.

P(Z > 1.5) = 1 - P(-∞ < Z ≤ 1.5) = 1 - 0.933 = 0.067.
P(-∞ < Z ≤ -1.0) = P(Z > 1.0) = 1 - P(-∞ < Z ≤ 1.0) = 1 - 0.841 = 0.159.
P(-1.0 < Z < 1.5) = P(-∞ < Z < 1.5) - P(-∞ < Z ≤ -1.0) = 0.933 - 0.159 = 0.774.
P(Z ≤ -1.0 or Z ≥ 1.5) = 1 - P(-1.0 < Z < 1.5) = 1 - 0.774 = 0.226.



Table 1.3

z       Area under curve from -∞ to z      z       Area under curve from -∞ to z
0.000   0.500                              1.600   0.945
0.100   0.539                              1.645   0.950
0.200   0.579                              1.700   0.955
0.300   0.618                              1.800   0.964
0.400   0.655                              1.960   0.975
0.500   0.691                              2.000   0.977
0.600   0.726                              2.100   0.982
0.700   0.758                              2.200   0.986
0.800   0.788                              2.300   0.989
0.900   0.816                              2.330   0.990
1.000   0.841                              2.400   0.992
1.100   0.864                              2.500   0.994
1.282   0.900                              2.576   0.995
1.300   0.903                              2.600   0.996
1.400   0.919                              2.700   0.997
1.500   0.933                              2.800   0.998
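The entries of Table 1.3 and the tail probabilities quoted below can be reproduced with any statistical package; a minimal illustrative sketch using scipy's standard normal cdf:

```python
from scipy.stats import norm

# Phi(z): area under the standard normal pdf from -infinity to z.
for z in (1.282, 1.645, 1.960, 2.330, 2.576):
    print(z, round(norm.cdf(z), 3))

# Tail probabilities discussed in the text:
print(round(norm.sf(1.282), 3))                  # P(Z > 1.282), about 0.10
print(round(2 * norm.sf(1.645), 3))              # P(Z < -1.645 or Z > 1.645), about 0.10
print(round(norm.sf(1.960), 3))                  # P(Z > 1.960), about 0.025
print(round(norm.cdf(1.5) - norm.cdf(-1.0), 3))  # P(-1.0 < Z < 1.5), cf. 0.774 from the rounded table
```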

Several values of Z under N(0,1) are commonly encountered, viz. 1.282, 1.645, 1.960, 2.330, and 2.576.

P(Z > 1.282) = 0.10 or 10%.
P(Z < -1.645 or Z > 1.645) = 0.10 or 10%.
P(Z > 1.960) = 0.025 or 2.5%.
P(Z < -1.960 or Z > 1.960) = 0.05 or 5%.
P(Z > 2.330) = 0.01 or 1%.
P(Z < -2.576 or Z > 2.576) = 0.01 or 1%.

    The case for P(Z


The bivariate normal probability density function of (X, Y), referred to below as equation (1.1), can be written as

$$f(x,y) = \frac{1}{2\pi\,\sigma_X\sigma_Y\sqrt{1-\rho^2}}\; e^{-q/2} \tag{1.1}$$

where

$$q = \frac{1}{1-\rho^2}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right) + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]$$

and

$$\rho = \frac{\mathrm{cov}(x,y)}{\sigma_X\,\sigma_Y}.$$

The multivariate normal distribution pdf (p-variate normal pdf) is given by

$$f(x_1, x_2, \ldots, x_p) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}\, \exp\!\left[-\tfrac{1}{2}\,(x-\mu)^{T}\,\Sigma^{-1}\,(x-\mu)\right]$$

where x is the vector of random variables X_1 to X_p, μ is the p×1 vector of means of x, and Σ is the p×p covariance matrix of x. If p=2 is substituted into the above, the bivariate pdf shown in equation (1.1) can be obtained.
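As a small numerical consistency check (with arbitrary illustrative parameter values, not taken from the book), the p-variate density can be evaluated with scipy and, for p = 2, compared against the bivariate form of equation (1.1):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative bivariate parameters.
mu_X, mu_Y = 0.01, 0.02
sd_X, sd_Y, rho = 0.05, 0.10, 0.4
cov = np.array([[sd_X**2,       rho*sd_X*sd_Y],
                [rho*sd_X*sd_Y, sd_Y**2      ]])

x, y = 0.03, -0.05

# p-variate formula evaluated by scipy at p = 2.
f_scipy = multivariate_normal(mean=[mu_X, mu_Y], cov=cov).pdf([x, y])

# Bivariate form of equation (1.1): f(x,y) = exp(-q/2) / (2*pi*sd_X*sd_Y*sqrt(1-rho^2)).
zx, zy = (x - mu_X) / sd_X, (y - mu_Y) / sd_Y
q = (zx**2 - 2*rho*zx*zy + zy**2) / (1 - rho**2)
f_manual = np.exp(-q / 2) / (2 * np.pi * sd_X * sd_Y * np.sqrt(1 - rho**2))

print(f_scipy, f_manual)   # the two numbers agree
```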

The kth moment of random variable X is $\int x^k f(x)\, dx$, where f(x) is the pdf of X. If μ = E(X) is the mean of X, the kth central moment of X is $\int (x-\mu)^k f(x)\, dx$. Notice that the variance is the second central moment of X. The 3rd central moment divided by variance^{3/2} is known as skewness. The 4th central moment divided by variance² is known as kurtosis.

The normal random variable X ~ N(μ, σ²) has mean μ, variance σ², skewness 0, and kurtosis 3. Hence the standard normal variate Z ~ N(0,1) has mean 0, variance 1, skewness 0, and kurtosis 3.

Many financial variables, e.g. daily stock returns, currency rates of change, etc., display skewness as well as large kurtosis compared with the benchmark normal distribution with its symmetrical pdf, skewness = 0, and kurtosis = 3.
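Sample skewness and kurtosis are easily computed and compared against the normal benchmarks of 0 and 3. A minimal illustrative sketch using simulated data (not actual stock returns):

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)              # benchmark normal sample
r = rng.standard_t(df=6, size=100_000)        # heavier-tailed sample, a stand-in for daily returns

# fisher=False reports kurtosis on the scale used in the text (normal = 3).
for name, x in (("normal", z), ("fat-tailed", r)):
    print(name, round(skew(x), 2), round(kurtosis(x, fisher=False), 2))
```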

Figure 1.5
Example of a Pdf with Negative Skewness and Large Kurtosis
[Figure: a pdf f(x) plotted against x, showing negative or left skewness (a longer left tail) and fat tails with kurtosis > 3, against a normal pdf.]

  • 8/17/2019 Financial Valuation and Econometrics

    26/480

    16

Departure from normality is illustrated by the pdf in Figure 1.5. The shaded area in Figure 1.5 shows a normal pdf. The unshaded curve shows the pdf of a random variable with negative skewness and a kurtosis larger than that of the normal random variable.

The concept of stochastic independence between random variables is important. Two random variables X and Y are said to be stochastically independent if and only if their joint pdf can be expressed as follows:

f(X,Y) = f_X(X) f_Y(Y).

One implication of the above is that for any function h(.) of X and any function g(.) of Y, their expectation can be found as

E(h(X) g(Y)) = E(h(X)) E(g(Y)).

A special case is the covariance operator. If X and Y are (stochastically) independent, then their covariance is zero:

$$\mathrm{cov}(X,Y) = E\big[(X-\mu_X)(Y-\mu_Y)\big] = E(X-\mu_X)\,E(Y-\mu_Y) = 0.$$

The converse is not always true. It is true only for special cases such as when X and Y are jointly normally distributed: when X and Y are jointly normally distributed, then if they have zero covariance, they are stochastically independent. For the bivariate normal pdf, the conditional density is

$$g(x\,|\,y) = \frac{f(x,y)}{f_Y(y)}.$$

    Or,

$$g(x\,|\,y) = \frac{\dfrac{1}{2\pi\,\sigma_X\sigma_Y\sqrt{1-\rho^2}}\; e^{-q/2}}{\dfrac{1}{\sqrt{2\pi}\,\sigma_Y}\; e^{-\frac{1}{2}\left(\frac{y-\mu_Y}{\sigma_Y}\right)^2}}$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma_X\sqrt{1-\rho^2}}\, \exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{x-\mu_X}{\sigma_X} - \rho\,\frac{y-\mu_Y}{\sigma_Y}\right]^2\right\}$$

$$= \frac{1}{\sqrt{2\pi\,\sigma_X^2(1-\rho^2)}}\, \exp\!\left\{-\frac{1}{2\,\sigma_X^2(1-\rho^2)}\left[x-\mu_X - \rho\,\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)\right]^2\right\}$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma_{X|Y}}\, \exp\!\left\{-\frac{1}{2\,\sigma_{X|Y}^2}\big(x-\mu_{X|Y}\big)^2\right\}$$


where $\sigma_{X|Y}^2 = (1-\rho^2)\,\sigma_X^2$ is the variance of X conditional on Y=y, and $\mu_{X|Y} = \mu_X + \rho\,\frac{\sigma_X}{\sigma_Y}\,(y-\mu_Y)$ is the mean of X conditional on Y=y.
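These conditional-moment formulas can be checked by simulation: draw from a bivariate normal, keep the draws whose y values fall near a chosen level, and compare the conditional sample mean and variance of x with μ_{X|Y} and σ²_{X|Y}. A rough sketch with arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_X, mu_Y = 1.0, 2.0
sd_X, sd_Y, rho = 2.0, 3.0, 0.6
cov = [[sd_X**2, rho*sd_X*sd_Y], [rho*sd_X*sd_Y, sd_Y**2]]

draws = rng.multivariate_normal([mu_X, mu_Y], cov, size=1_000_000)
x, y = draws[:, 0], draws[:, 1]

y0 = 4.0                                   # condition on Y being close to y0
sel = np.abs(y - y0) < 0.05
print(x[sel].mean(), mu_X + rho * (sd_X / sd_Y) * (y0 - mu_Y))   # conditional mean, theory 1.8
print(x[sel].var(),  (1 - rho**2) * sd_X**2)                     # conditional variance, theory 2.56
```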

There are some common continuous probability distributions that are related to the normal distribution. If random variable $X \overset{d}{\sim} N(\mu, \sigma^2)$, then the random variable

$$V = \left(\frac{X-\mu}{\sigma}\right)^2 \sim \chi^2_1$$

has a chi-square distribution with 1 degree of freedom. If X_1, X_2, X_3, …, X_n are n random variables each independently drawn from the same population distribution N(μ, σ²), or think of {X_i}_{i=1 to n} as a random sample of size n, then

$$\sum_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^2 \sim \chi^2_n$$

has a chi-square distribution with n degrees of freedom.

If $X \overset{d}{\sim} N(0,1)$ and $V \overset{d}{\sim} \chi^2_r$, and X and V are stochastically independent, then

$$\frac{X}{\sqrt{V/r}}$$

has a Student-t distribution with r degrees of freedom. If $U \overset{d}{\sim} \chi^2_{r_1}$, $V \overset{d}{\sim} \chi^2_{r_2}$, and U and V are stochastically independent, then

$$\frac{U/r_1}{V/r_2} \overset{d}{\sim} F_{r_1, r_2}$$

is an F-distribution with degrees of freedom r_1 and r_2. If random variable $X \overset{d}{\sim} N(\mu, \sigma^2)$, and Y = exp(X) or X = ln(Y), then Y is a random variable with a lognormal distribution.
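The constructions above can be illustrated by building chi-square, Student-t, and F draws from independent standard normals and comparing a few sample moments with their theoretical values. A minimal sketch (the degrees of freedom are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r1, r2, N = 5, 4, 8, 200_000

Z = rng.standard_normal((N, n))
chi2_n = (Z**2).sum(axis=1)                 # sum of n squared N(0,1) draws ~ chi-square(n)
print(chi2_n.mean(), chi2_n.var())          # theory: mean n = 5, variance 2n = 10

U = (rng.standard_normal((N, r1))**2).sum(axis=1)   # ~ chi-square(r1)
V = (rng.standard_normal((N, r2))**2).sum(axis=1)   # ~ chi-square(r2), independent of U

t = rng.standard_normal(N) / np.sqrt(V / r2)        # ~ Student-t with r2 d.o.f.
print(t.mean(), t.var())                    # theory: mean 0, variance r2/(r2-2) = 8/6

F = (U / r1) / (V / r2)                     # ~ F(r1, r2)
print(F.mean())                             # theory: r2/(r2-2) = 8/6 for r2 > 2
```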

    1.4 STATISTICAL ESTIMATION

Suppose a random variable X with a fixed normal distribution N(μ, σ²) is given. Suppose there is a random draw of a number or outcome from this distribution. This is the same as stating that random variable X takes a realized value x. Let this value be x_1; it may be, say, 3.89703. Suppose we repeatedly make random draws and thus form a sample of n observations: x_1, x_2, x_3, …, x_{n-1}, x_n. This is called a random sample with a sample size of n. Each x_i comes from the same distribution N(μ, σ²), but each pair x_i, x_j are realizations from independent sampling.

We next compute a statistic, which is a function of the realized values {x_k}, k = 1, 2, …, n. Consider a statistic, the sample mean


$$\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k.$$

Another common sample statistic is the unbiased sample variance

$$s^2 = \frac{1}{n-1}\sum_{k=1}^{n}\big(x_k - \bar{x}\big)^2.$$

Each time we select a random sample of size n, we obtain a realization x̄. Thus, x̄ is itself a realization of a random variable, and this r.v. can be denoted by

$$\bar{X}_n = \frac{1}{n}\sum_{k=1}^{n} X_k$$

where each X_k above is clearly the random variable from N(μ, σ²) itself. X̄_n is a random variable and its probability distribution is called the sampling distribution of the mean, or perhaps more clearly, the distribution of the sample mean.

What is the exact probability distribution of X̄_n?

$$E(\bar{X}_n) = \frac{1}{n}\,E\Big(\sum_{k=1}^{n} X_k\Big) = \frac{1}{n}\sum_{k=1}^{n} E(X_k) = \frac{1}{n}\sum_{k=1}^{n}\mu = \mu.$$

$$\mathrm{var}(\bar{X}_n) = \frac{1}{n^2}\,\mathrm{var}\Big(\sum_{k=1}^{n} X_k\Big) = \frac{1}{n^2}\sum_{k=1}^{n}\mathrm{var}(X_k) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$

Since X̄_n is a normal random variable, therefore

$$\bar{X}_n \sim N\!\left(\mu,\ \frac{\sigma^2}{n}\right).$$

The standardized normal random variable then becomes

$$\frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \sim N(0, 1).$$

On the other hand, E(s²) = σ². But s² itself has a sampling distribution:

$$\frac{(n-1)\,s^2}{\sigma^2} \overset{d}{\sim} \chi^2_{n-1}.$$

Thus it can be seen that E(χ²_{n-1}) = n-1, the number of degrees of freedom of the chi-square random variable. Therefore,

$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)/\sigma}{\sqrt{s^2/\sigma^2}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s}$$


is distributed as Student-t with (n-1) degrees of freedom and zero mean. Denote the random variable with a t-distribution and n-1 degrees of freedom as t_{n-1}. Then,

$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s} \overset{d}{\sim} t_{n-1}.$$

Suppose we find (-a, +a), a > 0, such that Prob(-a ≤ t_{n-1} ≤ +a) = 95%. Since t_{n-1} is symmetrically distributed, then Prob(-a ≤ t_{n-1}) = 97.5% and Prob(t_{n-1} ≤ +a) = 97.5%. Thus,

$$\mathrm{Prob}\left(-a \le \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{s} \le a\right) = 0.95.$$

Also,

$$\mathrm{Prob}\left(\bar{X}_n - a\,\frac{s}{\sqrt{n}} \le \mu \le \bar{X}_n + a\,\frac{s}{\sqrt{n}}\right) = 0.95.$$

Suppose x_1, x_2, x_3, …, x_{n-1}, x_n are randomly sampled from $X \overset{d}{\sim} N(\mu, \sigma^2)$ with sample size n = 30. The t-statistic value such that Prob(t_{29} ≤ 2.045) = 97.5% is t_{29} = 2.045. Then

$$\mathrm{Prob}\left(\bar{X}_n - 2.045\,\frac{s}{\sqrt{30}} \le \mu \le \bar{X}_n + 2.045\,\frac{s}{\sqrt{30}}\right) = 0.95.$$

Hence the 95% confidence interval estimate of μ is given by

$$\left(\bar{X}_n - 2.045\,\frac{s}{\sqrt{30}},\ \ \bar{X}_n + 2.045\,\frac{s}{\sqrt{30}}\right)$$

when the estimated s is entered.
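The 2.045 critical value and the resulting interval can be reproduced directly; the sketch below simulates a sample of size 30 (the true μ and σ are arbitrary illustrative values) and forms the 95% confidence interval for μ:

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(3)
x = rng.normal(loc=0.01, scale=0.05, size=30)   # illustrative sample, n = 30

n = len(x)
xbar = x.mean()
s = x.std(ddof=1)                               # unbiased sample s.d. (divides by n-1)

a = t.ppf(0.975, df=n - 1)                      # 97.5th percentile of t_29, about 2.045
lo, hi = xbar - a * s / np.sqrt(n), xbar + a * s / np.sqrt(n)
print(round(a, 3), (round(lo, 4), round(hi, 4)))
```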

    1.5 STATISTICAL TESTING

In many situations there is a priori (or ex-ante) information about the value of the mean μ, and it may be desirable to use observed data to test if the information is correct. μ is called a parameter of the population or fixed distribution N(μ, σ²). A statistical hypothesis is an assertion about the true value of the population parameter, in this case μ. A simple hypothesis specifies a single value for the parameter, while a composite hypothesis specifies more than one value. We will work with the simple null hypothesis H_0 (sometimes called the maintained hypothesis), which is what is postulated to be true. The alternative hypothesis H_A is what will be the case if the null hypothesis is rejected. Together the values specified under H_0 and H_A should form the total universe of possibilities of the parameter. For example,

H_0: μ = 1
H_A: μ ≠ 1.


A statistical test of the hypothesis is a decision rule that, given the inputs from the sample values and hence the sampling distribution, chooses either to reject or else not to reject (intuitively similar in meaning to "accept") the null H_0. Given this rule, the set of sample outcomes or sample values that leads to rejection of H_0 is called the critical region. If H_0 is true but is rejected, a Type I error is committed. If H_0 is false but is accepted, a Type II error is committed.

The statistical rule on H_0: μ = 1, H_A: μ ≠ 1, is that if the test statistic

$$t_{n-1} = \frac{\bar{X}_n - 1}{s/\sqrt{n}},$$

which is t-distributed with (n-1) degrees of freedom, falls within the critical region (shaded), defined as {t_{n-1} < -a or t_{n-1} > +a}, a > 0, as shown in Figure 1.6 below, then H_0 is rejected in favor of H_A. Otherwise H_0 is not rejected and is "accepted".

    Figure 1.6

Critical Region under the Null Hypothesis H_0: μ = 1
[Figure: the t_{n-1} pdf with the shaded critical region in the two tails beyond -a and +a.]

If H_0 were true, then the t-distribution would be correct, and therefore the probability of rejecting H_0 would be the area of the critical region, or 5% in this case. Notice that for n = 61, P(-2 < t_60 < 2) = 0.95. Moreover, the t-distribution is symmetrical, so each of the right and left shaded tails makes up 2.5%. This is called a 2-tailed test with a significance level of 5%. The significance level is the probability of committing a Type I error when H_0 is true. In the above example, if the sample t-statistic is 1.045, then it is < 2, and we cannot reject H_0 at the 2-tailed 5% significance level. Given a sample t-value, we can also find its p-value, which is the probability under H_0 of t_60 exceeding 1.045 in a one-tailed test, or of |t_60| exceeding 1.045 in a 2-tailed test. In the above 2-tailed test, the p-value of a sample statistic of 1.045 would be 2 × Prob(t_60 > 1.045) = 2 × 0.15 = 0.30 or 30%. Another way to apply the test is: if the p-value < the test significance level, reject H_0; otherwise H_0 cannot be rejected.
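The p-value arithmetic quoted above can be verified from the t-distribution directly; a minimal illustrative sketch:

```python
from scipy.stats import t

t_stat, df = 1.045, 60
p_one_tail = t.sf(t_stat, df)          # P(t_60 > 1.045), about 0.15
p_two_tail = 2 * p_one_tail            # two-tailed p-value, about 0.30
print(round(p_one_tail, 3), round(p_two_tail, 3))

# Two-tailed 5% critical value for 60 degrees of freedom, about 2.00:
print(round(t.ppf(0.975, df), 3))
```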

In theory, if we reduce the probability of Type I error, the probability of Type II error increases, and vice-versa. This is illustrated in Figure 1.7, as follows.


Figure 1.7
[Figure: the t_{n-1} pdf under H_0 and, as a dotted curve, the true distribution of the statistic when μ > 1, with the critical region beyond -2 and +2 shaded.]

Suppose H_0 is false, and μ > 1, so the true t_{n-1} distribution is represented by the dotted curve in Figure 1.7. The critical region {t_{n-1} < -2.00 or t_{n-1} > 2.00} remains the same, so the probability of committing a Type II error is 1 minus the sum of the shaded areas. Clearly, this probability increases as we reduce the critical region in order to reduce Type I error. Although it is ideal to reduce both types of errors, the tradeoff forces us to choose between the two. In practice, we fix the probability of Type I error when H_0 is true, i.e. determine a fixed significance level, e.g. 10%, 5%, or 1%. The power of a test is the probability of rejecting H_0 when it is false. Thus power = 1 - P(Type II error), or power equals the shaded area in Figure 1.7. Clearly this power is a function of the alternative parameter value μ ≠ 1. We may determine such a power function of μ ≠ 1.

Thus reducing the significance level also reduces power, and vice-versa. In statistics, it is customary to want to design a test so that its power function of μ ≠ 1 equals or exceeds that of any other test with equal significance level, for all plausible parameter values μ ≠ 1 in H_A. If such a test is found, it is called a uniformly most powerful test.

We have seen the performance of a 2-tailed test. Sometimes we embark instead on a one-tailed test such as H_0: μ = 1, H_A: μ < 1, in which we theoretically rule out the possibility of μ > 1, i.e. P(μ > 1) = 0. In this case, it makes sense to limit the critical region to only the left side, for when μ < 1, then t_{n-1} will tend to become smaller. Thus at the one-tail 5% significance level, the critical region is {t_{n-1} < -1.671} for n = 61.

    1.6 DATA TYPES

Consider the types of data series that are commonly encountered in regression analyses. There are four generic types, viz.



(a) Time series
(b) Cross-sectional
(c) Pooled time series cross-sectional
(d) Panel/longitudinal/micropanel

Time series are the most prevalent in empirical studies in finance. They are data indexed by time. Each data point is a realization of a random variable at a particular point in time. The data occur as a series over time. A sample of such data is typically a collection of the realized data over time, such as the history of ABC stock's prices on a daily basis from 1970 January 2 till 2002 December 31.

Cross-sectional data are also common in finance. An example is the reported annual net profit of all companies listed on an exchange for a specific year. If we collect the cross sections for each year over a 20-year period, then we have a pooled time series cross section of companies over 20 years. Panel data are less used in finance. They are data collected by tracking specific individuals or subjects over time and across subjects.

The nature of data also differs according to the following categories.

(a) Quantitative
(b) Ordinal, e.g. very good, good, average, poor
(c) Nominal/categorical, e.g. married/not married, college graduate/non-graduate

Quantitative data such as return rates, prices, volume of trades, etc. have the least limitations and therefore the greatest use in finance. These data provide not only ordinal rankings or comparisons of magnitudes, but also exact degrees of comparison. There are some limitations, and therefore special considerations, in the use of the other categories of data. In the treatment of ordinal and nominal data, we may have to use specific tools such as dummy variables in regression.

    1.7 PROBLEM SET

1.1 X, Y, Z are r.v.'s with a joint pdf f(X,Y,Z) that is integrable. Show, using the concept of marginal pdf's, that E(X+Y+Z) = E(X)+E(Y)+E(Z) by integrating over (X+Y+Z).

1.2 Show how one could express $\mathrm{cov}\big(\sum_{i=1}^{N} X_i,\ \sum_{j=1}^{N} X_j\big)$ in terms of the N by N covariance matrix $\Sigma_{N\times N}$.


1.3 The following is the probability distribution table of a trivariate (U1, U2, U3).

U1             -1     -1     -1     -1      1      1      1      1
U2             -2     -2      2      2     -2     -2      2      2
U3             -3      3     -3      3     -3      3     -3      3
P(U1,U2,U3)   .125   .125   .125   .125   .125   .125   .125   .125

Find the bivariate probability distribution P(U1, U2). Find the marginal P(U3).

1.4 In the probability distribution table of the trivariate (U1, U2, U3),

U1             -1     -1     -1     -1      1      1      1      1
U2             -2     -2      2      2     -2     -2      2      2
U3             -3      3     -3      3     -3      3     -3      3
P(U1,U2,U3)   .125   .125   .125   .125   .125   .125   .125   .125

after finding P(U1,U2), suppose Yi = bXi + Ui, i=1,2, and X1=1, X2=2.

(i) Find the E(Ui)'s and cov(U1, U2).
(ii) Find the probability distribution of the estimator
$$\hat{b} = \left(\sum_{i=1}^{2} X_i^2\right)^{-1}\left(\sum_{i=1}^{2} X_i Y_i\right).$$
This probability distribution of the estimator is called the sampling distribution of b̂.
(iii) Find the mean and variance of b̂ from its probability distribution.

    1.5 X, Y have joint pdf f(X,Y) = exp(-X-Y) for 0


3K number of random values Z_j belonging to a univariate normal N(0,1) distribution, and W_i = 0.5Z_{3i-2} + 0.3Z_{3i-1} + 0.2Z_{3i} for i = 1, 2, …, K, what is the variance of the sampling mean $\frac{1}{K}\sum_{i=1}^{K} W_i$?

1.8 Suppose r.v. $X_i \sim N\!\left(0, \tfrac{1}{60}\right)$ for i = 1, 2, …, K, and X_i and X_j are independent when i ≠ j. If AX_i ~ N(0,1) where A is a constant, what is A? If the random vector Y = (X_1, X_2, ……, X_K), what is the distribution of $YY^{T}$?

1.9 If cov(a,b) = 0.1, cov(c,a) = 0.2, cov(d,a) = 0.3, and x = b + 2c + 3d, what is cov(a, x)?

1.10 Suppose X, Y, Z are jointly distributed as follows.

Probability    X     Y     Z
0.5           +1    -1     0
0.5           -1     0    +1

Find cov(X,Y), cov(X,Z), and cov(Y,Z).

    FURTHER RECOMMENDED READINGS

[1] Mood, Alexander M., F. A. Graybill, and D. C. Boes, Introduction to the Theory of Statistics, 3rd or later editions, McGraw-Hill.
[2] Hogg, Robert V., and Allen T. Craig, Introduction to Mathematical Statistics, 4th or later editions, Collier Macmillan.


    Chapter 2

    STATISTICAL LAWS AND 

    CENTRAL LIMIT THEOREM

    APPLICATION: STOCK  RETURN DISTRIBUTIONS 

    Key Points of Learning

Stochastic process, Stationarity, Law of large numbers, Central limit theorem, Rates of return, Lognormal distribution, Information sets, Random walk, Law of iterated expectations, Unconditional expectation, Conditional mean, Conditional variance, Jarque-Bera test

In this chapter, we shall build on the fundamental notions of probability distribution and statistics from the last chapter, and extend consideration to a sequence of random variables. In financial applications, it is mostly the case that the sequence is indexed by time, hence a stochastic process. Interesting statistical laws or mathematical theories result when we look at the relationships within a stochastic process. We introduce an application of the Central Limit Theorem to the study of stock return distributions.

    2.1 STOCHASTIC PROCESS

A stochastic process is a sequence of random variables X_1, X_2, X_3, …, and so on. Each X_i has a probability density function or pdf. A common type of sequence is indexed by time t_1 < t_2 < t_3 < … for $X_{t_1}, X_{t_2}, X_{t_3}$, and so on.

A stochastic process {X_i}_{i=1,2,…} is said to be weakly (covariance) stationary if each X_i has the same mean and variance, and cov(X_i, X_{i+k}) = γ(k), i.e. a function dependent only on k. As an example, suppose the monthly stock return rates $\tilde{r}_t$, where

$\tilde{r}_1$ = return rate in Jan 2009
$\tilde{r}_2$ = return rate in Feb 2009
……………………….. etc.

form a stochastic process that is weakly stationary. If Var($\tilde{r}_1$) = 0.25, what is Var($\tilde{r}_5$)? Clearly this is the same constant 0.25. If Cov($\tilde{r}_1$, $\tilde{r}_3$) = 0.10, what is


cov($\tilde{r}_7$, $\tilde{r}_9$)? Clearly, this is 0.10, since the time gap between the two random variables is similarly two months in either case.

Suppose we have a realized history of the past 60 monthly return rates {r_t}_{t=1,2,…,60}. Each of these r_t's is a known number, e.g. 0.01, one percent, or -0.005, negative half percent. The realized number r_t is a sample point taken from the pdf of random variable $\tilde{r}_t$. We have to learn to distinguish between what is a random variable that has an attached pdf, and what is a realized sample point that is a given number. Notice that sometimes a tilde "~" is put over the variable to denote it as being random. The past history or realized values of the stochastic process, {r_t}_{t=1,2,…,60}, e.g. {0.010, -0.005, 0.003, 0.008, -0.012, ………., 0.008}, is called a time series, which is a time-indexed sequence of sample points of each random variable $\tilde{r}_t$ in the stochastic process {$\tilde{r}_t$}_t.

A stochastic process {X_i}_i is said to be strongly stationary if each set {X_i, X_{i+1}, X_{i+2}, …, X_{i+k}}, for any i and the same k, has the same joint multivariate pdf independent of i. As an example, consider the joint multivariate normal distribution, MVN. Suppose the following is strongly stationary,
$$\left(\tilde r_1, \tilde r_2, \tilde r_3\right) \overset{d}{\sim} \mathrm{MVN}\!\left(M_{3\times 1}, \Sigma_{3\times 3}\right);$$
then clearly the joint multivariate pdf of r̃_3, r̃_4, r̃_5 has the same MVN(M, Σ). There are two very important and essential theorems dealing with stochastic processes and therefore applicable to the study of time series of empirical data. They are the Law of Large Numbers and the Central Limit Theorem.

    2.2 LAW OF LARGE NUMBERS

The Law of Large Numbers (LLN) states that if x_1, x_2, …, x_n is a realized sample randomly chosen from any random variable X_i with a fixed pdf, where each time a draw is taken from an independent X_i, then the sample average or sample mean converges to the expected value of random variable X_i, or E(X_i). This is sometimes referred to as Kolmogorov's LLN when the convergence refers to a sample mean taken from a time series, and the corresponding stochastic process is stationary and also independently distributed. The latter implies that any X_j and X_k within {X_i}_i are independent. We will discuss convergence in a later chapter, but for now, it suffices to understand it as "approaching in value" in some arbitrarily close fashion. Thus, the law of large numbers states:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n} x_i = \mu, \quad \text{where } E(X_i) = \mu.$$


An extension of the above, relaxing the assumption of independence, states that in a (stationary) ergodic stochastic process {X_t}_t with mean μ, i.e. E(X_t) = μ for all t, if x_1, x_2, …, x_n is a realized sample randomly chosen from the stochastic process {X_t}_t, then
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n} X_i = \mu.$$

An ergodic stochastic process is one in which two random variables in the process that are sufficiently far apart in the time index tend toward (asymptotic) independence. The independently identically distributed process (i.i.d. process) is a special case of the stationary ergodic process. In many applications, usually stationary ergodicity or some related variation is assumed, and then sample means are used to estimate population means.
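To make the two statements concrete, here is a minimal Python simulation sketch (not from the text): it computes sample means from an i.i.d. normal process and from an assumed stationary, ergodic AR(1) process, and both drift toward the population mean μ as the sample size grows. The AR(1) coefficient and all other parameter values are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean_iid(n, mu=0.01, sigma=0.05):
    """Sample mean of n i.i.d. N(mu, sigma^2) draws."""
    return rng.normal(mu, sigma, size=n).mean()

def sample_mean_ar1(n, mu=0.01, phi=0.6, sigma=0.05):
    """Sample mean of n draws from a stationary ergodic AR(1):
    X_t - mu = phi * (X_{t-1} - mu) + e_t, with |phi| < 1."""
    x = np.empty(n)
    x[0] = mu + rng.normal(0.0, sigma / np.sqrt(1.0 - phi**2))  # start in stationarity
    for t in range(1, n):
        x[t] = mu + phi * (x[t - 1] - mu) + rng.normal(0.0, sigma)
    return x.mean()

for n in (100, 10_000, 200_000):
    # Both sample means approach mu = 0.01 as n grows.
    print(n, round(sample_mean_iid(n), 5), round(sample_mean_ar1(n), 5))
```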

    2.3 CENTRAL LIMIT THEOREM

The Central Limit Theorem states that if X_1, X_2, …, X_n is a vector of random variables drawn from the same stationary distribution with mean μ and variance σ², and suppose we let
$$Y = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}},$$
or else,
$$Y = \frac{\sqrt{n}\left(\bar{X} - \mu\right)}{\sigma}, \qquad \text{where } \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
then Y is a random variable that converges in distribution, as the sample size n approaches infinity, to a standard normal random variable, i.e.
$$\lim_{n\to\infty} Y \overset{d}{\sim} N(0,1).$$
For sufficiently large n, suppose
$$Y = \frac{\sqrt{n}}{\sigma}\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right) \sim N(0,1), \quad \text{then} \quad \frac{1}{n}\sum_{i=1}^{n} X_i = \mu + \frac{\sigma}{\sqrt{n}}\,Y \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).$$
Or,
$$\sum_{i=1}^{n} X_i = n\mu + \sigma\sqrt{n}\,Y \sim N\!\left(n\mu,\, n\sigma^2\right). \qquad (2.1)$$


This says that when n is large, the sample mean X̄, itself a random variable, is normally distributed with mean μ and variance σ²/n.
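A small Monte Carlo check of this statement (a sketch, not part of the text) draws sample means from a deliberately non-normal distribution; an Exponential(1) is used here purely as an assumed example because its mean and variance are both 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 250, 20_000            # sample size, and number of simulated sample means
mu, sigma = 1.0, 1.0             # an Exponential(1) draw has mean 1 and variance 1

xbar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print(xbar.mean())                            # close to mu
print(xbar.std(ddof=1), sigma / np.sqrt(n))   # close to sigma / sqrt(n)

# The standardized sample means are approximately standard normal, as in (2.1):
z = np.sqrt(n) * (xbar - mu) / sigma
print(z.mean(), z.std(ddof=1))                # approximately 0 and 1
```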

    2.4 STOCK RETURN RATES

In finance, the lognormal distribution is important. A pertinent example is a stock price at time t, P_t. There are several empirically observed characteristics of stock prices that are neat and could be appropriately captured by the lognormal probability distribution model.
(a) P_t > 0, i.e. prices must be strictly positive.
(b) Return rates derived from stock prices over time are normally distributed when measured over a sufficiently long interval, e.g. a month.
(c) Returns could display a small trend or drift, i.e. increase or decrease over time.
(d) The ex-ante (anticipated) variance of the return rate increases with the holding period.

We examine the case when P_t is lognormally distributed to see if this distribution offers the above characteristics. Lognormally distributed P_t means that ln P_t ~ N, a normal distribution. Thus P_t = exp(N) > 0, where N is a normal random variable. Hence (a) is satisfied. Likewise, ln P_{t+1} ~ Normal. Therefore,
$$\ln P_{t+1} - \ln P_t \overset{d}{\sim} \text{Normal}, \quad \text{or} \quad \ln\!\left(\frac{P_{t+1}}{P_t}\right) \overset{d}{\sim} \text{Normal}.$$

Now
$$r_{t,t+1} \equiv \ln\!\left(\frac{P_{t+1}}{P_t}\right)$$
is the continuously compounded stock return over the holding period or interval [t, t+1). If the time interval or each period is small, this is approximately the discrete return rate P_{t+1}/P_t - 1. However, the discrete return rate is bounded from below by -1. Contrary to that, the return r_{t,t+1} has (-∞, ∞) as support, as in a normal distribution.

We can justify how r_{t,t+1} can be reasonably normally distributed, or equivalently, that the price is lognormally distributed, over a longer time interval. Consider a small time interval or period Δ = 1/T, such that


$\ln\!\left(\frac{P_{t+\Delta}}{P_t}\right)$, the small-interval continuously compounded return, is a random variable (not necessarily normal) with mean μ_Δ = μ/T and variance σ_Δ² = σ²/T. The allowance of a small drift μ ≠ 0 in the above satisfies (c). Aggregating the returns,
$$\ln\frac{P_{t+\Delta}}{P_t} + \ln\frac{P_{t+2\Delta}}{P_{t+\Delta}} + \ln\frac{P_{t+3\Delta}}{P_{t+2\Delta}} + \cdots + \ln\frac{P_{t+T\Delta}}{P_{t+(T-1)\Delta}} = \ln\frac{P_{t+T\Delta}}{P_t}. \qquad (2.2)$$
The right-hand side of equation (2.2) is simply the continuously compounded return $r_{t,t+1} = \ln\!\left(\frac{P_{t+1}}{P_t}\right)$ over the longer period [t, t+1), whose length is made up of T = 1/Δ number of Δ-periods. The left-hand side of equation (2.2), invoking the Central Limit Theorem, for large T, is N(Tμ_Δ, Tσ_Δ²) or N(μ, σ²) since TΔ = 1. Hence r_{t,t+1} ~ N(μ, σ²), which satisfies (b) and justifies the use of the lognormal distribution for prices.

Moreover, $P_{t+k} = P_t\, e^{r_{t,t+k}} > 0$ even if the return r_{t,t+k} may sometimes be negative. Suppose the returns r_{t,t+1}, r_{t+1,t+2}, r_{t+2,t+3}, …, r_{t+k-1,t+k} are independent. Then
$$\mathrm{var}\!\left(\ln\frac{P_{t+k}}{P_t}\right) = \sum_{j=0}^{k-1}\mathrm{var}\!\left(r_{t+j,t+j+1}\right) = k\sigma^2.$$
Thus, the ex-ante variance of return increases with the holding period [t, t+k). This satisfies characteristic (d).

It is important to recognize that the discrete or holding-period return rate $\frac{P_{t+1}}{P_t} - 1$ does not display some of the appropriate characteristics. The discrete period returns have to be aggregated geometrically in the following way:
$$1 + R_{t,t+1} = \frac{P_{t+1}}{P_t}, \qquad 1 + R_{t,t+k} = \prod_{j=0}^{k-1}\left(1 + R_{t+j,t+j+1}\right) = \prod_{j=0}^{k-1}\frac{P_{t+j+1}}{P_{t+j}} = \frac{P_{t+k}}{P_t} > 0.$$

     

The lower boundary of zero is implied by the limited liability of owners of listed stocks. This discrete setup is cumbersome and poses analytical intractability when it comes to computing drifts and variances. It is straightforward to compute the means and variances of sums of random variables, as in the case of the continuously compounded returns, but not so for


products of random variables when they are not necessarily independent, as in the case of the discrete period returns here.
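The tractability point can be seen numerically. The following sketch (with made-up daily figures) shows that continuously compounded returns simply add over a holding period, while the corresponding discrete returns must be compounded as a product; the two aggregates agree through the identity r = ln(1 + R).

```python
import numpy as np

rng = np.random.default_rng(2)
T = 21                                    # e.g. about 21 trading days in one month
r = rng.normal(0.0005, 0.01, size=T)      # assumed daily continuously compounded returns

# Continuously compounded returns add over the holding period:
r_month = r.sum()

# The equivalent discrete (holding-period) returns compound geometrically:
R = np.exp(r) - 1.0                       # daily discrete returns
R_month = np.prod(1.0 + R) - 1.0

print(r_month, np.log(1.0 + R_month))     # identical up to rounding error
print(R_month > -1.0)                     # discrete return bounded below by -1: True
```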

    2.5 CONDITIONAL MEAN AND VARIANCE

Earlier we have seen how, when two random variables X, Y are jointly bivariate normal, we can express the conditional mean or expectation of one in terms of the other, viz.
$$E(X|Y) = E(X) + \frac{\sigma_{XY}}{\sigma_Y^2}\left[Y - E(Y)\right]. \qquad (2.3)$$

We could be more precise in the use of notation to denote the expectation of X:
$$E^{X}(X) = E^{X,Y}(X),$$
where the superscripts on the expectation operator denote that the integral is taken with respect to those random variables. We could also use the small letter x to denote the sample realization of random variable X, although sometimes we ignore this if the context is clear as to which is used, whether it is a random variable or a realized value. We could also employ the notation $E^{X|y}(X)$ to denote an expected value taken on random variable X based on the conditional probability of X given Y = y.

When two random variables X, Y are not jointly normal, the linear relationship in equation (2.3) is not possible based just on distributional assumptions. Instead, we have to impose the linear relationship directly. For example, we may assume or specify:

X = a + bY + e                  (2.4)
where a, b are constants, and e is a random variable with zero mean and variance σ_e², and is independent of Y. Equation (2.4) is called a linear regression model, or a linear relationship connecting two or more random variables including at least one unobservable random variable e. Then E(X) = a + b E(Y).

Now E(X|Y) = a + bY, since E(Y|Y) = Y and E(e|Y) = E(e) = 0. Then E(X|Y) = a + b E(Y) + b[Y - E(Y)] = E(X) + b[Y - E(Y)]. Also, cov(Y,X) = cov(Y,a) + b cov(Y,Y) + cov(Y,e) = b var(Y). Thus,
$$b = \frac{\sigma_{XY}}{\sigma_Y^2}.$$
Hence we can write
$$E(X|Y) = E(X) + \frac{\sigma_{XY}}{\sigma_Y^2}\left[Y - E(Y)\right],$$
which is identical to the case under bivariate normality. Thus, it may be seen that the linear regression model in (2.4) plays a crucial role.

From (2.4), var(X) = b² var(Y) + σ_e², where var(e) = σ_e². But the conditional variance var(X|Y) = σ_e². Therefore var(X|Y) < var(X). Thus it is seen that


providing information Y reduces the ex-ante uncertainty or variance of X. Of course, if X and Y are not linearly related, i.e. b = 0, then var(X|Y) = var(X), in which case knowing Y does not reduce the uncertainty. This idea of reducing uncertainty about X with given relevant information Y is central in the thinking and theory of finance. For example, if we know pertinent information about tomorrow's stock return movements, then the risk of investing in stocks will be suitably reduced.
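A short simulation sketch of the linear regression model (2.4), with arbitrarily assumed values a = 1, b = 2, and σ_e = 0.5, illustrates the point: the unconditional variance of X is b² var(Y) + σ_e², whereas the conditional variance given Y is only σ_e².

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
a, b, sigma_e = 1.0, 2.0, 0.5             # assumed parameters of X = a + bY + e

Y = rng.normal(0.0, 1.0, size=n)
e = rng.normal(0.0, sigma_e, size=n)      # zero mean, independent of Y
X = a + b * Y + e

print(X.var())                            # about b^2 * var(Y) + sigma_e^2 = 4.25
print((X - (a + b * Y)).var())            # conditional variance, about sigma_e^2 = 0.25
print(np.cov(Y, X)[0, 1] / Y.var())       # recovers b = cov(Y, X) / var(Y), about 2
```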

    2.6 INFORMATION SET AND RANDOM WALK

We have introduced the idea of a stochastic process earlier. Let the time sequence of random variables P_t, P_{t+1}, P_{t+2}, … represent the prices of a stock at time t, t+1, t+2, etc. Then {P_t}_t is a stochastic process.

Let Φ_t, Φ_{t+1}, Φ_{t+2}, … represent the information set at time t, t+1, t+2, etc. that is available to the decision-maker for making forecasts of the future price of the stock. We may interpret an information set Φ_t essentially as a random variable that can take realized sample values φ_t that are called information. A piece of information at different times t, t+1, t+2, etc., viz. φ_t, φ_{t+1}, φ_{t+2}, …, can be thought of as some function of another random variable Y_t at time t, i.e. φ_t(y_t). φ_t has a joint density with P_t, and possibly also with P_{t+1}. Therefore, since Y_t and P_t, P_{t+1} are jointly determined in a probabilistic manner, then given information φ_t(y_t), a better forecast of the next-period P_{t+1} can be attained.

E_t(P_{t+1}) ≡ E(P_{t+1}|Φ_t) is a conditional expectation or forecast of next period's P_{t+1} based on information available at t, i.e. Φ_t. Notice that the subscript t is used to denote evaluation of the integral over the information set at t, i.e. Φ_t. Such applications of conditional expectations are plentiful in the finance literature. Early studies in finance suggest simple stochastic processes for prices, such as a random walk:

P_{t+1} = μ + P_t + e_{t+1}                  (2.5)
where μ is a constant drift, and e_{t+1} is a disturbance or white noise, i.e. a random variable that is independent of past information as well as prices. Equation (2.5) is sometimes called an arithmetic random walk in prices. The latter name arises since, when an arithmetic or subtraction operation is performed on the prices, such as taking the price difference, the result is equal to a constant with an added disturbance.

Since P_t ∈ Φ_t, E_t(P_{t+1}) ≡ E(P_{t+1}|Φ_t) = E(P_{t+1}|P_t), as only P_t is relevant according to the random walk process above. Other information within Φ_t apart from P_t is redundant in equation (2.5). This is an implication of the arithmetic random walk in prices. Thus, E_t(P_{t+1}) ≡ E(P_{t+1}|Φ_t) = μ + P_t.


If μ = 0, then the best forecast of tomorrow's price is today's price P_t, according to the random walk theory as in (2.5). Suppose we construct a random walk in the natural logarithm of price. Then
ln P_{t+1} = μ + ln P_t + e_{t+1}.
This is sometimes called the geometric random walk in prices. Or,
$$r_{t,t+1} = \ln\!\left(\frac{P_{t+1}}{P_t}\right) = \mu + e_{t+1}.$$
If we specify e_{t+1} ~ N(0, σ²), then we are back to the lognormal model described earlier. Thus we see that we can construct meaningful return rate distributions using the linear model of a stochastic price process as in the random walk model above. The linear model is essentially a difference equation in the logarithms of price.

Suppose the information set ς_t is a subset of Φ_t; the Law of Iterated Expectations states:
E[ E(P_{t+1}|Φ_t) | ς_t ] = E(P_{t+1}|ς_t).
If we condition on the null set ∅, then E[ E(P_{t+1}|Φ_t) | ∅ ] = E(P_{t+1}|∅) = E(P_{t+1}), which is also the unconditional expectation or unconditional forecast. Applying the Law of Iterated Expectations to information revelation over time,
E[ E(P_{t+2}|Φ_{t+1}) | Φ_t ] = E(P_{t+2}|Φ_t) since Φ_t ⊆ Φ_{t+1}.
Or, E_t[ E_{t+1}(P_{t+2}) ] = E_t(P_{t+2}). The best forecast of tomorrow (t+1)'s forecast of P at t+2 is equal to the best forecast of P at t+2 made today at t.
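The following sketch simulates the geometric random walk ln P_{t+1} = μ + ln P_t + e_{t+1} with assumed values of μ, σ, and the starting price; it confirms that prices stay strictly positive while the continuously compounded returns are simply μ + e_{t+1}.

```python
import numpy as np

rng = np.random.default_rng(4)
T, mu, sigma, P0 = 1_000, 0.0004, 0.01, 100.0   # assumed drift, volatility, initial price

e = rng.normal(0.0, sigma, size=T)
log_P = np.log(P0) + np.cumsum(mu + e)          # ln P_{t+1} = mu + ln P_t + e_{t+1}
P = np.exp(log_P)                               # implied price path

r = np.diff(np.log(np.concatenate(([P0], P))))  # continuously compounded returns = mu + e
print(r.mean(), r.std(ddof=1))                  # close to mu and sigma
print(P.min() > 0.0)                            # prices remain strictly positive: True
```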

    2.7 LAW OF ITERATED EXPECTATIONS

In the following, we shall show more formally that E[ E(P_{t+1}|Φ_t) | ς_t ] = E(P_{t+1}|ς_t) when ς_t ⊆ Φ_t. Although not necessary, for convenience of proof we shall assume that the jointly distributed random variables X, Y, and Z have joint probability density function f(x, y, z). Then $f(x, y) = \int_z f(x, y, z)\, dz$.

The Law of Iterated Expectations can take various forms. At the simplest level, consider
$$E^{X,Y}(Y) = \int_x \int_y y\, f(x,y)\, dy\, dx$$
$$= \int_x \int_y y\, \frac{f(x,y)}{f(x)}\, f(x)\, dy\, dx$$
$$= \int_x \left[ \int_y y\, f(y|x)\, dy \right] f(x)\, dx$$
$$= \int_x E^{Y|x}(Y)\, f(x)\, dx$$
$$= E^{X}\!\left[ E^{Y|x}(Y) \right],$$

     

      

     

     

      

     

     


where we use the notation $E^{Y|x}(Y)$ to denote an expected value taken on random variable Y based on the conditional probability density function f(y|x) expressed in the superscript to the expectation operator. We could also have used $E^{Y|x}[Y(x)]$, indicating that the integrand is y, which is a function of x. Similarly, we should be able to show

     

     

     

$$E^{X,Y|z}(Y|z) = \int_x \int_y y\, f(x,y|z)\, dy\, dx$$
$$= \int_x \int_y y\, \frac{f(x,y|z)}{f(x|z)}\, f(x|z)\, dy\, dx$$
$$= \int_x \left[ \int_y y\, f(y|x,z)\, dy \right] f(x|z)\, dx$$
$$= \int_x E^{Y|x,z}(Y)\, f(x|z)\, dx$$
$$= E^{X|z}\!\left[ E^{Y|x,z}(Y) \right].$$

        

     

      

     

     

We can think of the random variables {X, Z} as the information set Φ_t, and the values {x, z} as the realized information φ_t at time t. Likewise, {Z} is an information set ς_t, and clearly ς_t ⊆ Φ_t. Thus we may rewrite the equation of the law of iterated expectations above as
$$E(Y|\varsigma_t) = E\!\left[\, E(Y|\Phi_t)\, \big|\, \varsigma_t \right],$$
where we have simplified the notation by dropping the superscripts to the expectation operator as long as the integration is taken over a proper probability distribution and the condition is reflected by the argument term "|Φ_t" or "|ς_t". Let Y = P_{t+1}; then E[ E(P_{t+1}|Φ_t) | ς_t ] = E(P_{t+1}|ς_t) as asserted earlier.
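A brute-force numerical check of E[ E(Y|Φ_t) | ς_t ] = E(Y|ς_t) can be run on a small assumed discrete example, where the finer information set is {X, Z} and the coarser one is {Z}; the within-cell sample averages stand in for the conditional expectations.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000

# Assumed setup: Z and X are binary, and Y depends on both plus independent noise.
Z = rng.integers(0, 2, size=n)
X = rng.integers(0, 2, size=n)
Y = 1.0 + 2.0 * X + 3.0 * Z + rng.normal(0.0, 1.0, size=n)

# Inner expectation E(Y | X, Z): the average of Y within each (X, Z) cell.
E_Y_given_XZ = np.empty(n)
for x in (0, 1):
    for z in (0, 1):
        cell = (X == x) & (Z == z)
        E_Y_given_XZ[cell] = Y[cell].mean()

# Averaging the inner expectation over the coarser set {Z} reproduces E(Y | Z).
for z in (0, 1):
    idx = Z == z
    print(E_Y_given_XZ[idx].mean(), Y[idx].mean())   # the two numbers agree for each z
```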

    2.8 TEST OF NORMALITY

Given the importance of the role of the normal distribution in financial returns data, it is not surprising that many statistics have been devised to test if a given sample of data {r_i} comes from a normal distribution. One such statistic is the Jarque-Bera (JB) test of normality.1 The test is useful only when the sample size n is large (sometimes we call such a test an asymptotic test). The JB test statistic is
$$n\left[\frac{\gamma^2}{6} + \frac{(\kappa - 3)^2}{24}\right] \overset{d}{\sim} \chi^2_2,$$


where γ is the skewness measure of {r_i}, and κ is the kurtosis measure of {r_i}. The inputs of these measures to the JB test statistic are usually sample estimates. For {r_i} to follow a normal distribution, its skewness sample estimate γ̂ should converge to 0, since the normal distribution is symmetrical with third moment equal to zero, and its kurtosis sample estimate κ̂ should converge to 3. If the JB statistic is too large, exceeding say the 95th percentile of a χ² distribution with 2 degrees of freedom, or 5.99, then the null hypothesis, H0, of a normal distribution is rejected. The JB statistic is large if γ̂ and (κ̂ - 3) deviate materially from zero.

Recall that the p-value ("probability-value") of a realized test statistic t* based on the null hypothesis distribution is either:
(a) the probability of obtaining test statistic values whose magnitudes are even larger than t*, i.e. P(t ≥ t*) for a one-tail (right-tail) test, or
(b) the probability of obtaining test statistic values whose absolute magnitudes are larger than t* in a symmetrical zero-mean null distribution, i.e. P(t ≤ -|t*| or t ≥ |t*|) for a two-tail test.
Thus, if in a statistical test the significance level is set at α% and the p-value is x%, then reject H0 if x ≤ α, and "accept" H0 if x > α.

1 See C. M. Jarque and A. K. Bera (1987), "A Test for Normality of Observations and Regression Residuals," International Statistical Review, vol. 55, 163-172.
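A sketch of the JB computation, using the sample-moment formulas that reappear in Section 2.9, is given below. The closed-form p-value exploits the fact that a chi-square variable with 2 degrees of freedom has tail probability exp(-x/2); the test data are simulated and purely illustrative.

```python
import numpy as np

def jarque_bera(r):
    """JB normality test from sample skewness and kurtosis."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    mu = r.mean()
    sigma = r.std(ddof=1)                               # divisor n-1, as in Section 2.9
    skew = np.mean((r - mu) ** 3) / sigma ** 3
    kurt = np.mean((r - mu) ** 4) / sigma ** 4
    jb = n * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)
    p_value = np.exp(-jb / 2.0)                         # exact chi-square(2) tail probability
    return jb, p_value

rng = np.random.default_rng(6)
print(jarque_bera(rng.normal(size=1_000)))              # small JB, large p: do not reject
print(jarque_bera(rng.standard_t(df=3, size=1_000)))    # fat tails: large JB, tiny p
```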

    2.9 CORPORATE STOCK RETURNS

In this section, we provide examples of the stock return sample distributions of two companies.

The American Express Company (AXP) is one of the Dow Jones Industrial Average's 30 companies, and is a large diversified international company specializing in financial services, including the iconic AMEX card services. It is globally traded. The AXP daily stock returns in a 5-year period from 1/3/2003 to 12/31/2007 are collected from the public source Yahoo Finance and processed as follows. The return rates are daily continuously compounded return rates ln(P_{t+1}/P_t). Thus, weekly as well as monthly stock returns can be easily computed from the daily return rates. The continuously compounded weekly return rate would be the sum of the daily returns for the week, Monday to Friday. The continuously compounded monthly return rate would be the sum of the daily returns for the month. As there are typically about 5 trading days in a week, since stocks are traded mainly through stock exchanges that operate on a 5-day week, the weekly return is computed from the sum of 5 daily return rates. Similarly, there are only about 21 to 22 trading days on average in a month for adding up to the monthly return. Yearly or annual returns will be summed over about 252 to 260 trading days in a year. (An outlier of more than a 12% drop in a single day in the database on 2005 October 3rd was dropped.)
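A sketch of this processing step is shown below: daily continuously compounded returns are taken as log price relatives, and weekly (five-day) or monthly (about 21-day) returns are obtained by summing consecutive blocks of daily returns. The price vector here is a short placeholder rather than the actual AXP series.

```python
import numpy as np

# Placeholder daily closing prices, oldest first (in practice, read from the downloaded data).
prices = np.array([50.10, 50.25, 49.90, 50.40, 50.55, 50.30, 50.80, 51.00, 50.70, 50.95, 51.20])

daily = np.diff(np.log(prices))          # continuously compounded daily returns ln(P_{t+1}/P_t)

def aggregate(returns, k):
    """Sum consecutive blocks of k daily returns (k = 5 for weekly, about 21 for monthly)."""
    m = len(returns) // k
    return returns[:m * k].reshape(m, k).sum(axis=1)

weekly = aggregate(daily, 5)
print(np.round(daily, 4))
print(np.round(weekly, 4))
```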


Tuesday's return is usually computed as the log (natural logarithm) of the close of Tuesday's stock price relative to the close of Monday's price. Unlike other days, however, one has to be sensitive to the fact that Monday's return cannot usually be computed as the log (natural logarithm) of the close of Monday's stock price relative to the close of Friday's price. The latter return spans 3 days, and some may argue that the Monday daily return should be a third of this, although it is also clearly the case that Saturday and Sunday have no trading. Some may use the closing price relative to the opening price on the same day to compute daily returns. The open-to-close return signifies return captured during daytime trading when the Exchange is open, whereas the close-to-open return signifies the price change taking place overnight. We shall not be concerned with these issues for the present purpose.

The three series of daily, weekly, and monthly return rates are tabulated in histograms. Descriptive statistics of these distributions such as mean, standard deviation, skewness, and kurtosis are reported. The Jarque-Bera tests for normality of the distributions are also conducted.

Figure 2.1
Histogram and Statistics of Daily AXP Stock Return Rates

Series: DR        Sample: 1 1257        Observations: 1257
Mean          0.000377
Median       -0.000210
Maximum       0.063081
Minimum      -0.057786
Std. Dev.     0.013331
Skewness      0.129556
Kurtosis      5.817207
Jarque-Bera   419.1988
Probability   0.000000

In Figure 2.1, the JB test statistic shows a p-value of less than 0.0005. Thus normality is rejected for the daily return rates.


In Figure 2.2, the JB test statistic for the weekly return rates likewise shows a p-value small enough to reject normality.


In the tables of the Figures, note that the sample size is n = 1257 for the daily returns, n = 251 for the weekly returns, and n = 57 for the monthly returns. The mean formula is $\hat\mu = \frac{1}{n}\sum_{i=1}^{n} r_i$. The standard deviation (Std. Dev.) formula is $\hat\sigma = \sqrt{\sum_{i=1}^{n}\left(r_i - \hat\mu\right)^2 \big/ (n-1)}$. The skewness and kurtosis formulae are, respectively,
$$\hat\gamma = \frac{\sum_{i=1}^{n}\left(r_i - \hat\mu\right)^3}{n\,\hat\sigma^3} \quad \text{and} \quad \hat\kappa = \frac{\sum_{i=1}^{n}\left(r_i - \hat\mu\right)^4}{n\,\hat\sigma^4}.$$
It is interesting to note that daily and weekly stock return rates are usually not normal, but aggregation to monthly return rates produces normality, as would be expected from our earlier discussion of the Central Limit Theorem. This result has important implications for the financial modeling of stock returns. Short-interval return rates should not be modeled as normal given our findings. In fact, the descriptive statistics of the return rates for the different intervals above show that shorter-interval return rates tend to display higher kurtosis or "fat" tails in the pdf. Many recent studies of shorter-interval return rates introduce other kinds of distributions or else stochastic volatility to produce returns with "fatter" tails or higher kurtosis than that of the normal distribution.

The next example is that of the Overseas Chinese Banking Corporation (OCBC), which is one of the 3 largest banks in Singapore. OCBC is a strong blue-chip stock with plenty of liquidity in trading. The OCBC bank daily stock returns in a 5-year period from 10/27/1997 to 10/25/2002 are collected from the Singapore Stock Exchange (SGX) source and processed as follows. The return rates are daily continuously compounded return rates ln(P_{t+1}/P_t). Weekly as well as monthly stock returns are computed from the daily return rates. Likewise, the three series of daily, weekly, and monthly return rates are tabulated in histograms shown in Figures 2.4, 2.5, and 2.6. Descriptive statistics of these distributions such as mean, standard deviation, skewness, and kurtosis are reported. The Jarque-Bera tests for normality of the distributions are also conducted.

As in the case of the American Express Company, the daily return rates of OCBC show very high kurtosis, deviating from normality. There is also skewness. In Figures 2.4 and 2.5, the JB test statistics show p-values of less than 0.0005; thus normality is rejected at significance level 0.0005, or 0.05%, for the daily as well as the weekly returns.

From Figure 2.4, the mean return in the sampling period is 0.0259% per day, or about 253 × 0.0259% = 6.55% per annum. The daily return standard


deviation or volatility is 2.527%. The annual volatility may be computed as √253 × 0.02527 = 40.19%.
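As a quick check of this annualization arithmetic (a sketch; the 253 trading-day count is the approximation used above):

```python
# Annualizing the daily OCBC statistics reported above, assuming about 253 trading days per year.
daily_mean = 0.000259
daily_std = 0.02527
trading_days = 253

annual_mean = trading_days * daily_mean         # about 0.0655, i.e. 6.55% per annum
annual_vol = trading_days ** 0.5 * daily_std    # about 0.4019, i.e. 40.19% per annum
print(round(annual_mean, 4), round(annual_vol, 4))
```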

Figure 2.4
Histogram and Statistics of Daily OCBC Stock Return Rates

Series: DR        Sample: 1 1305        Observations: 1305
Mean          0.000259
Median        0.000000
Maximum       0.163979
Minimum      -0.116818
Std. Dev.     0.025270
Skewness      0.297441
Kurtosis      6.720127
Jarque-Bera   771.7571
Probability   0.000000

Figure 2.5
Histogram and Statistics of Weekly OCBC Stock Return Rates

Series: WR        Sample: 1 1301        Observations: 255
Mean          0.001324
Median        0.007725
Maximum       0.203466
Minimum      -0.261330
Std. Dev.     0.061151
Skewness     -0.272323
Kurtosis      5.723938
Jarque-Bera   81.98756
Probability   0.000000

In Figure 2.6, the JB test statistic shows a p-value of 0.858. Thus normality is not rejected at significance level 0.10 or 10%. In the tables of the Figures,


note that the sample size is n = 1305 for the daily returns, n = 255 for the weekly returns, and n = 65 for the monthly returns.

Figure 2.6
Histogram and Statistics of Monthly OCBC Stock Return Rates

Series: MR        Sample: 1 1281        Observations: 65
Mean          0.005055
Median        3.76E-16
Maximum       0.244560
Minimum      -0.247624
Std. Dev.     0.113431
Skewness     -0.081671
Kurtosis      2.706749
Jarque-Bera   0.305165
Probability   0.858488

    2.10 PROBLEM SET

2.1 There is a sample of size 60 taken from an unknown distribution, except that the variance is known to be 0.24. The sample mean is 0.5. Find numbers a, b so that the unknown population mean lies in the interval (a, b) with 95% probability. (a, b) is called the 95% confidence interval for the population mean. What theorem are we implicitly using in deriving this result?

2.2 Let R_{t,t+Δ} be the continuously compounded rate of return over the interval (t, t+Δ], where Δ is small. Suppose R_{t,t+Δ}, R_{t+Δ,t+2Δ}, etc. are each i.i.d. with mean μ_Δ and variance σ_Δ². Suppose an interval of 1 month is divided into N such Δ-intervals.
(i) Explain why the monthly continuously compounded return rate R_{t,t+NΔ} may be approximated by N(μ_Δ N, σ_Δ² N).
(ii) Given the monthly return rate time series R_1, R_2, …, R_60 over 5 years,

explain how you would construct a test of normality for R_t (assuming a stationary stochastic process), showing a test statistic