Fooled by Compounding

R. David McLean

The Journal of Portfolio Management, Winter 2012

R. David McLean is an associate professor of finance at the University of Alberta in Edmonton, AB, Canada, and a visiting assistant professor of finance at MIT in Cambridge, MA.

[email protected]

Compounding can make things appear to be larger than they really are. This effect can arise when returns resulting from an event are compounded over a long holding period. In this setting it is not uncommon for authors to describe returns resulting from compounding as a return caused by the individual event. This mistake results in exaggerating the significance of the event. In this article, I review several examples of this common mistake, which are found in a popular book on rare events, newspaper articles, investment advisors’ research reports, and finance journal articles. I also describe alternative methods of return measurement that are not affected by compounding and show that these methods can lead to different inferences than measures that include compounding.

For an understanding of how compounding can distort things, consider the following example. A portfolio normally yields a return of 1% per month. In one month the portfolio has an abnormal event, which yields a monthly return that is greater than 1%. A benchmark portfolio has a return of 1% in every month. An analyst wants to communicate the significance of the event. To do so she calculates the buy-and-hold return of the portfolio and compares it to that of the benchmark over holding periods of 1, 5, 10, and 50 years. The event occurs in month t, and the returns are measured beginning in month t through the subsequent holding periods.

Over the 1-year holding period, the portfolio has a buy-and-hold return of 13.80%, while the benchmark has a buy-and-hold return of 12.68%. The analyst reports an abnormal buy-and-hold return of 13.80% − 12.68% = 1.12%. The analyst repeats this exercise and computes abnormal buy-and-hold returns of 1.80%, 3.27%, and 387.71% over the 5-, 10-, and 50-year holding periods, respectively. Hence, for the same event, we have four different abnormal returns, which range from 1.12% to 387.71%. Which abnormal return accurately describes the magnitude of the return in the event month?

The point that I strive to make in this article is that none of the previous returns accurately describes the abnormal return in the event month. In the preceding example, the return in the event month is 2%, so the abnormal return in the event month is simply 2% − 1% = 1%. If the analyst wants to communicate the size of the event, she can simply state that the return in the event month is 1 percentage point larger than the returns in the other months. The problem with the buy-and-hold returns is that they reflect both the event and the compounding, and therefore do not reveal how important the event was.
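The arithmetic of this example can be checked with a short Python sketch. It assumes, as the text states, a 2% return in the event month and 1% in every other month, with the event in the first month of each holding period; the function and constant names are illustrative and not from the article.

```python
NORMAL = 0.01  # return in every ordinary month (from the text)
EVENT = 0.02   # return in the event month (from the text)


def buy_and_hold_return(monthly_returns):
    """Compound a sequence of monthly returns into one holding-period return."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0


def abnormal_buy_and_hold(years):
    """Portfolio minus benchmark buy-and-hold return over the given horizon."""
    months = 12 * years
    portfolio = [EVENT] + [NORMAL] * (months - 1)  # event occurs in month t
    benchmark = [NORMAL] * months
    return buy_and_hold_return(portfolio) - buy_and_hold_return(benchmark)


for years in (1, 5, 10, 50):
    print(f"{years:>2}-year abnormal buy-and-hold return: "
          f"{abnormal_buy_and_hold(years):.2%}")
print(f"event-month abnormal return: {EVENT - NORMAL:.2%}")
```

The four horizons give roughly 1.12%, 1.80%, 3.27%, and 387.71%, while the event-month abnormal return stays at 1% in every case.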

Compounding in this setting multiplies

the abnormal return in the event month by

the compound factor from the other months.

To see this, let E equal the return in the event






By removing the ten biggest one-day moves from the U.S. stock market over the past fifty years we see a huge difference in returns—and yet conventional finance sees these one day jumps as mere anomalies.

The problem here is attributing the total difference in buy-and-hold returns to “one-day jumps” that occurred on the 10 largest days. When the 10 largest days are removed from the total buy-and-hold return of the S&P 500 over a 50-year period, not only are the returns of the 10 days removed, but so is the compounding of those days’ returns over the 12,829 other days in the holding period, and this is what really matters. If half of the returns from the S&P 500 over the last 50 years were due to jumps, then these 10 jumps should be visible in Exhibit 1, which plots this investment. Do we observe 10 jumps in the line that plots the return of the S&P 500 in Exhibit 1?

Exhibit 1
Investing in the S&P 500 over the Last 50 Years

This exhibit plots the value of $1 that is invested in the S&P 500 during the period 1955 to 2005. In Panel A, the value is also computed excluding the 10 days with the highest returns. In Panel B, the value is also computed excluding both the 10 days with the highest returns and the 10 days with the lowest returns.

In March 2006, the Financial Times ran a series of articles called “Mastering Uncertainty.” Benoit Mandelbrot and Nassim Taleb wrote an article for this series titled “A Focus on the Exceptions That Prove the Rule.”¹ As in The Black Swan, the point of the article is largely to show that just a few outliers account for the bulk of many things (e.g., book sales, internet searches, and wealth). Mandelbrot and Taleb [2006] applied this framework to the stock market, and refer to a graph similar to that in Exhibit 1, stating the following:

Taken together, these facts should be enough to demonstrate that it is the so-called outlier and not the regular that we need to model. For instance, a very small number of days account for the bulk of the stock market changes: just 10 trading days represent 63 per cent of the returns of the past 50 years.

It is easy to show that virtually all of what Taleb [2007] and Mandelbrot and Taleb [2006] were referring to is not returns on the ten largest days, but rather the compounding of those returns over thousands of other days. Panel A of Exhibit 1 shows the total returns of an investor who invested $1 in the S&P 500 in 1955 and held that investment through 2005 (the data are from CRSP), as reported by Taleb [2007] and Mandelbrot and Taleb [2006]. The investment is shown both with and without the 10 largest-return days for the S&P 500. When the 10 largest days are included, the investor has terminal wealth of $191. When the 10 largest days are excluded, the investor’s terminal wealth is $112. Hence, excluding the 10 largest days costs the investor $78.92, or 41% of her total return over the 50-year holding period.

Panel A of Exhibit 2 displays the returns of the 10 highest-return days, which range from 4.77% to 8.81%. If the investor were to invest only $1 on each of these 10 days, in isolation, the investor would earn $0.55 in dollar returns. Of the $78.92 reduction in terminal value that results from excluding the 10 largest days, only $0.55, or 0.7%, comes from the returns that could be independently achieved on those 10 days. The other $78.37, or 99.3%, results from compounding, as shown in Exhibit 3.
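The split between simple interest and compounding can be sketched in Python. The dollar figures below are the ones reported in the text; the helper `decompose_exclusion_cost` is a hypothetical illustration of how the same split could be computed from any series of daily returns, and is not the author's code.

```python
def decompose_exclusion_cost(daily_returns, k):
    """Split the cost of skipping the k best days into simple interest
    and compounding, for $1 invested at the start of the series."""
    terminal_all = 1.0
    for r in daily_returns:
        terminal_all *= 1.0 + r

    best = sorted(daily_returns, reverse=True)[:k]
    terminal_without = terminal_all
    for r in best:
        terminal_without /= 1.0 + r  # undo the growth earned on that day

    total_cost = terminal_all - terminal_without
    simple_interest = sum(best)  # $1 invested on each best day in isolation
    return total_cost, simple_interest, total_cost - simple_interest


# Figures reported in the text for the S&P 500, 1955-2005.
REPORTED_TOTAL_COST = 78.92
REPORTED_SIMPLE_INTEREST = 0.55

simple_share = REPORTED_SIMPLE_INTEREST / REPORTED_TOTAL_COST
compounding_share = 1.0 - simple_share
print(f"simple interest share: {simple_share:.1%}")
print(f"compounding share:     {compounding_share:.1%}")
```

With the reported figures, the simple-interest share is about 0.7% and the compounding share about 99.3%.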

To further understand the problems of attributing half of the S&P 500’s returns to just 10 days, consider Panel B of Exhibit 1. Like Panel A, Panel B of Exhibit 1 shows the total return of an investor who invested $1 in the S&P 500 in 1955 and held it through 2005. Panel B shows this investment with all days and compares it to an investment that removes both the 10 largest-return and 10 worst-return days. The investment that excludes both the 10 best and 10 worst days has a terminal value of $253, which is 32% greater than the investment that includes all of the days ($191). Clearly, the 10 highest-return days do not represent half of the market’s returns, because we can make an investment that excludes them and get a larger return. Why does excluding the 10 worst days make such a big difference? Panel B in Exhibit 2 shows that the 10 worst days have larger returns (in absolute value) than the 10 best, so compounding these returns over thousands of other days has a greater effect than does compounding the 10 largest returns.

Exhibit 2
The 10 Highest- and 10 Lowest-Return Days for the S&P 500, 1955–2005

This exhibit reports the 10 highest-return days (Panel A) and the 10 lowest-return days (Panel B) for the S&P 500 during the period 1955–2005. The far right column reports the simple interest that would be earned from investing $1 on each of these days.




Those who want to make statements regarding the magnitude of high-return days can simply measure the returns on these days and compare them to the daily mean return of the sample. With respect to the S&P 500, its mean daily return over the period 1955–2005 is 0.045%, and its standard deviation is 0.908%. Each of the returns on the 10 largest days that are described in Exhibit 2 is more than five standard deviations greater than the sample’s mean. The returns on these days are certainly outliers, but in isolation they do not account for nearly half of the total returns that could be achieved by investing in the S&P 500 over this 50-year period.
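As a quick check of the "more than five standard deviations" statement, the following Python sketch converts the smallest and largest of the 10 highest daily returns quoted in the text (4.77% and 8.81%) into distances from the sample mean. The function name is illustrative.

```python
MEAN_DAILY_PCT = 0.045  # mean daily return, in percent (from the text)
STD_DAILY_PCT = 0.908   # daily standard deviation, in percent (from the text)


def standard_deviations_from_mean(return_pct):
    """Distance of a daily return from the sample mean, in standard deviations."""
    return (return_pct - MEAN_DAILY_PCT) / STD_DAILY_PCT


for return_pct in (4.77, 8.81):
    z = standard_deviations_from_mean(return_pct)
    print(f"{return_pct:.2f}% is {z:.1f} standard deviations above the mean")
```

Even the smallest of the 10 days sits a little over five standard deviations above the mean, which measures how unusual the day was without any compounding.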

Exaggerating the Importance of High-Return Days: More Examples

Journalists also exaggerate the effects of high-return days with compounding. As an example, an article that appeared in The New York Times on October 8, 2008, attributed the effects of compounding over a 40-year period to returns that occurred over just 90 days:

From 1963 to 2004, the index of American stocks he tested gained 10.84 percent annually in a geometric average, which avoided overstating the true performance. For people who missed the 90 biggest-gaining days in that period, however, the annual return fell to just 3.2 percent. Less than 1 percent of the trading days accounted for 96 percent of the market gains.
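One reading that is consistent with the quoted figures is that the "96 percent" compares terminal gains under the two annual rates. The Python sketch below assumes a 42-year holding period (1963 through 2004, inclusive); that horizon, and the function name, are assumptions rather than details given in the quotation.

```python
def share_of_gains_lost(full_rate, reduced_rate, years):
    """Fraction of the compounded terminal gain that disappears when the
    annual geometric return drops from full_rate to reduced_rate."""
    full_gain = (1.0 + full_rate) ** years - 1.0
    reduced_gain = (1.0 + reduced_rate) ** years - 1.0
    return 1.0 - reduced_gain / full_gain


YEARS = 42  # assumed length of the 1963-2004 sample

share = share_of_gains_lost(0.1084, 0.032, YEARS)
print(f"share of terminal gains lost: {share:.0%}")
```

Under this assumption the lost share comes out at about 96%, so the headline number depends on decades of compounding rather than on what was earned during the 90 days themselves.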

Even professional investment advisors make this mistake. The statement from The New York Times article referred to two studies conducted by Towneley Capital Management. Following is an excerpt from the introduction of a study by Towneley [2005]²:

What surprised us, however, was the conclusion that practically all of the market’s gains or losses over several decades occurred during only a handful of days or months. For example, in the original study, 95% of market gains between 1963 and 1993 were generated during a mere 1.2% of the trading days. Many people have contacted us over the last 10 years asking for copies of the study. Recently, however, we began receiving requests for an updated version. Since, we also were curious to see how the last decade might have changed results, we asked Dr. Seyhun to revise the study, incorporating data through the year 2004. The results were virtually unchanged: 96% of market gains between 1963 and 2004 occurred during only 0.9% of the trading days.

Exhibit 3
Why Excluding the 10 Largest Days Matters

This exhibit breaks out the total difference between investing $1 in the S&P 500 during the period 1955–2005 and investing $1 in the S&P 500 over the same period, but excluding the 10 largest-return days. The total difference is divided into simple interest and compounding effects. Simple interest from the 10 largest days is the sum of the simple interest that could be earned by investing $1 on each of these days. The difference due to compounding is the portion of the total difference that is not the result of the simple interest that could be earned by investing on these 10 days.

It is not surprising that Towneley found similar results in both samples, as compounding worked the same from 1963 to 1993 as it did from 1993 to 2004. Towneley is not the only major investment advisory firm to mistake the effects of compounding for returns created over just a few days. The following statement was obtained from the website of John Hancock Investment Advisors [2008]:

Market upswings are as unpredictable as declines, and history shows that a significant amount of the long-term return available from investing in stocks comes from gains made in a relatively small number of trading days.

Finance professors and finance journal editors also can be confused by compounding. The following statement was made by Estrada [2008]:

As these figures show, in all cases a very small number of days accounts for the bulk of returns delivered by emerging equity markets. Investors in these markets do not obtain their long-term returns smoothly and steadily over time but largely as a result of booms and busts. A negligible proportion of days determine a massive creation or destruction of wealth.

In fact, the opposite is true. Investors do earn their returns smoothly over time, as Exhibit 1 shows, and not in a few booms and busts, as Estrada claimed. Estrada [2009] made a similar misstatement in another article:

As these figures show, in all cases a very small number of days account for the bulk of returns delivered by equity markets. Investors do not obtain their long-term returns smoothly and steadily over time but largely as a result of booms and busts.

These examples show that the effects of compounding exaggerate the size of high-return days. In each of these examples, both the actual returns on high-return days and the compounding of these returns over thousands of other days are referred to as returns that occurred on just the high-return days.

BUY-AND-HOLD ABNORMAL RETURNS (BHARs)

Compounding can distort inference in event studies and mutual fund performance measurement. This section describes several examples of these effects. I describe two return-measurement methodologies that are not distorted by compounding: cumulative abnormal returns (CARs) and average abnormal returns (AARs).

Event Studies

As discussed earlier, compounding can also create confusion in abnormal return measurement. This problem can arise when buy-and-hold abnormal returns (BHARs) are used to test whether a portfolio’s returns exceed those of its benchmark over long holding periods, as in the example given at the beginning of this article. BHAR is a comparison between the total buy-and-hold return of a portfolio and that of its benchmark. BHAR, therefore, is affected by compounding. BHAR is computed as the buy-and-hold return of the portfolio minus the buy-and-hold return of its benchmark over the same holding period.
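The article's displayed BHAR equation did not survive in this copy. One standard way to write the definition given in the text is shown below; the symbols $R_{p,t}$ and $R_{b,t}$ (the month-$t$ returns of the portfolio and its benchmark) and $T$ (the number of months in the holding period) are notation assumed here, not necessarily the author's.

```latex
\[
\mathrm{BHAR}_{T}
  = \prod_{t=1}^{T} \bigl(1 + R_{p,t}\bigr)
  - \prod_{t=1}^{T} \bigl(1 + R_{b,t}\bigr)
\]
```

Because each product compounds every month's return, a single month in which $R_{p,t}$ differs from $R_{b,t}$ is scaled by the growth in all of the other months.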

Mitchell and Stafford [2000] and Fama [1998] also described problems that can arise with BHAR because of compounding.³ Fama cited the results of Desai and Jain [1997] and Ikenberry, Rankine, and Stice [1996] as examples of how BHARs can distort inference; both studies used BHARs to analyze abnormal returns following stock splits. Ikenberry, Rankine, and Stice reported that stock splits generate a one-year BHAR of 7.93%, while the BHARs in the second and third years are zero. The BHAR over the entire period (three-year BHAR) is 12.15%. There were no differences between the portfolios in years two and three, yet the BHAR still grows during these years because of compounding.

To see how compounding can distort inference with BHAR in a generic setting, consider an example similar to the one at the beginning of this article. A firm has a return of 2% in the month following an event, while the benchmark has a return of 1% in the same month. Both the firm and its benchmark have returns of 1% in all other months. If we measure the BHAR in just the month following the event, it is 1% (1 × 1.02 − 1 × 1.01). If we measure the returns over one year, the buy-and-hold return for the firm is 13.80% and for the benchmark it is 12.68%; the BHAR is therefore 1.12%. When the measurement horizon is extended to 5 years, the BHAR is 1.80%, and if extended to 10 years, the BHAR is 3.27%. Hence, as the horizon grows, the BHAR also grows, even though the returns are identical after the first month. For this reason the 3-year BHAR in Ikenberry, Rankine, and Stice [1996] exceeds the 1-year BHAR, even though the second- and third-year BHARs are zero.
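The growth of the BHAR with the horizon can be written in closed form for this example. The short derivation below assumes an event-month return $E$ in the first month, a return $r$ for the firm in every other month and for the benchmark in all months, and a horizon of $T$ months; the symbols $r$ and $T$ are notation assumed here.

```latex
\begin{align*}
\mathrm{BHAR}_{T}
  &= (1 + E)(1 + r)^{T-1} - (1 + r)^{T} \\
  &= (E - r)\,(1 + r)^{T-1}
\end{align*}
% With E = 0.02 and r = 0.01:
%   T = 1   gives 0.01
%   T = 12  gives 0.01 x 1.01^{11}  \approx 0.0112
%   T = 60  gives 0.01 x 1.01^{59}  \approx 0.0180
%   T = 120 gives 0.01 x 1.01^{119} \approx 0.0327
```

The event contributes only the factor $E - r$; everything else is the compound factor $(1 + r)^{T-1}$ from the other months, which grows without bound as $T$ increases.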

Why do people use BHARs? Barber and Lyon [1997] pointed out that BHARs accurately measure the investor’s experience. Indeed, it is true that if an investor were to invest in the stock-split portfolio described by Ikenberry, Rankine, and Stice [1996], then his terminal wealth would be 12.15% higher three years later compared to holding the benchmark during the same three-year period. It is also true, however, that the investor could have switched to the benchmark portfolio after the end of the first year and achieved the same return at the end of the third year.

When choosing a return-measurement methodology, it is important to consider the questions you are trying to answer. In the preceding example, the questions are, do stock splits create abnormal returns, and if so, how big are they? BHAR does not give an accurate answer to either question because it reports not only the abnormal returns, but also the compounding of those returns over the holding period.

Mutual Funds

The same type of inference problems with event studies can also happen when analyzing mutual fund returns. As an example, inference problems can occur when investors compare charts that plot the dollar value of an investment in a portfolio versus that in a benchmark portfolio. The findings in Evans [2010] highlight the potential for such a problem.

Evans documented an incubation process in which mutual fund companies privately start new funds and then bring some of them public after an incubation period. Only the funds with strong performance are brought public, while those with weak performance are terminated. The funds that are brought public have abnormal returns of 3.5% per year during the incubation period. (Note that if a company starts 10 new funds, then by luck we would expect 5 of the funds to beat the index in the first year.) After the incubation period, however, the average abnormal return for these funds is zero. Plotting the growth of a dollar invested, or using BHAR to measure performance over the entire life of an incubated fund, could make it appear as if the fund continued to have abnormal performance after the incubation period, when in fact there was none.

As an example, assume that a fund was incubated for a year and had a return of 13.5% during that year. Assume the fund’s benchmark had an average return of 10% per year, and that after the fund was brought public, it tied its benchmark over the next four years, as is typical for incubated funds. $100 invested in the benchmark would yield $110 after the first year and $161.05 at the end of the fifth year. $100 invested in the fund would yield $113.50 after the first year and $166.18 at the end of the fifth year. Hence, if the mutual fund company plotted the growth of $100 over the entire five-year period, it would appear that the fund created $166.18 − $161.05 = $5.13 in value. But all that is happening is just the compounding of the first year’s abnormal performance, which was $113.50 − $110.00 = $3.50, and was never available to the investor:

Mutual Fund’s Five-Year Terminal Value: $113.50 × 1.10^4 = $166.18

Benchmark’s Five-Year Terminal Value: $110.00 × 1.10^4 = $161.05
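The two terminal values can be reproduced with a few lines of Python. The sketch assumes one incubation year at 13.5% followed by four years at the benchmark's 10%, which is what the 1.10 raised to the fourth power in the calculation implies; the function and variable names are illustrative.

```python
def grow(value, annual_rate, years):
    """Compound a starting value at a constant annual rate."""
    return value * (1.0 + annual_rate) ** years


START = 100.0
BENCHMARK_RATE = 0.10
INCUBATION_RATE = 0.135

fund_year1 = grow(START, INCUBATION_RATE, 1)        # incubation year
bench_year1 = grow(START, BENCHMARK_RATE, 1)
fund_year5 = grow(fund_year1, BENCHMARK_RATE, 4)    # fund ties the benchmark
bench_year5 = grow(bench_year1, BENCHMARK_RATE, 4)

gap_year1 = fund_year1 - bench_year1
gap_year5 = fund_year5 - bench_year5
print(f"fund:      ${fund_year1:.2f} after year 1, ${fund_year5:.2f} after year 5")
print(f"benchmark: ${bench_year1:.2f} after year 1, ${bench_year5:.2f} after year 5")
print(f"gap grows by a factor of {gap_year5 / gap_year1:.4f}")
```

The gap after five years is the first-year gap of $3.50 multiplied by 1.10 to the fourth power, about $5.12 before rounding (the $5.13 in the text comes from subtracting the two rounded terminal values). No abnormal performance occurs after the first year.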

An investor in this case probably should not expect to do any better with the fund than with the benchmark, but compounding may fool her into believing otherwise. The longer the holding period, the better the fund looks, even though the fund is not creating any value after the first year.

More generally, even if a manager does produce abnormal performance, comparing the terminal value of an investment in a fund to the terminal value of an identical investment in the fund’s benchmark, as is commonly done in the mutual fund industry, is an imprecise way to measure a manager’s performance. This is because the difference in terminal values contains both the abnormal performance and the compounding of the abnormal performance over the measurement period.

Alternative Measures of Returns

Two return methodologies that are not influenced by compounding are cumulative abnormal returns (CARs) and average abnormal returns (AARs), which are also known as “calendar time” abnormal returns. Both of these methods, and especially AAR, are advocated over BHAR by Fama [1998] and Mitchell and Stafford [2000]. To compute CAR, we subtract the monthly return of the benchmark from that of the portfolio, and sum the differences over the sample period.
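The displayed CAR equation is missing from this copy. A form consistent with the verbal definition is given below, where $R_{p,t}$ and $R_{b,t}$ denote the month-$t$ returns of the portfolio and the benchmark and $T$ is the number of months in the sample period; this notation is assumed here rather than taken from the article.

```latex
\[
\mathrm{CAR}_{T} = \sum_{t=1}^{T} \bigl(R_{p,t} - R_{b,t}\bigr)
\]
```

Each month's abnormal return enters once and is never multiplied by the returns of other months, which is why the measure is unaffected by compounding.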

To compute AAR, we subtract the monthly return of the benchmark from that of the portfolio, and take an average of the difference over the sample period.⁴
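The displayed AAR equation is also missing from this copy. A form consistent with the verbal definition, using assumed notation ($R_{p,t}$ and $R_{b,t}$ for the month-$t$ returns of the portfolio and the benchmark, $T$ for the number of months), is:

```latex
\[
\mathrm{AAR}_{T} = \frac{1}{T} \sum_{t=1}^{T} \bigl(R_{p,t} - R_{b,t}\bigr)
\]
```

A single abnormal month therefore contributes less and less to the average as $T$ grows.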

To see how these methods can yield a different inference from BHAR, consider a mutual fund that has a return of 2% in the first month and 1% thereafter, and a benchmark portfolio that has a return of 1% per month throughout the entire sample period. The appendix reports the abnormal returns of the two funds with each of these different measures over the various holding periods. The AAR after one month is 1%. After one year, the AAR is 0.08% (0.01/12), and after five years, it is approximately 0.02% (0.01/60). Hence, as we increase the holding period, the AAR in this example goes to zero, as it should, because the manager only created value in the first month. In this example, the outcome with AAR is the exact opposite of the outcome with BHAR. The one-month BHAR is 1%, the one-year BHAR is 1.12%, and the five-year BHAR is 1.80%. The BHAR in this example will head toward infinity as the holding period gets larger. The CAR in this example is 1%, regardless of the holding period.
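The three measures can be compared side by side with a small Python sketch. It uses the example from the text (a fund returning 2% in the first month and 1% afterward, against a benchmark returning 1% every month); the function names are illustrative, and the formulas follow the verbal definitions of BHAR, CAR, and AAR given in the article.

```python
from math import prod


def bhar(portfolio, benchmark):
    """Difference between the compounded returns of portfolio and benchmark."""
    return prod(1.0 + r for r in portfolio) - prod(1.0 + r for r in benchmark)


def car(portfolio, benchmark):
    """Sum of the monthly return differences."""
    return sum(rp - rb for rp, rb in zip(portfolio, benchmark))


def aar(portfolio, benchmark):
    """Average of the monthly return differences."""
    return car(portfolio, benchmark) / len(portfolio)


def example_series(months):
    """Fund earns 2% in month one and 1% afterward; benchmark earns 1% always."""
    fund = [0.02] + [0.01] * (months - 1)
    bench = [0.01] * months
    return fund, bench


for months in (1, 12, 60, 120):
    fund, bench = example_series(months)
    print(f"{months:>3} months: BHAR {bhar(fund, bench):.2%}  "
          f"CAR {car(fund, bench):.2%}  AAR {aar(fund, bench):.3%}")
```

As the horizon lengthens, BHAR keeps rising, CAR stays at 1%, and AAR shrinks toward zero.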

Which is the best method to use? In the fund manager example, AAR may provide the clearest picture if the question we want to answer is, should we expect the manager to create value in the future? Over the entire five-year period of this example, the manager only beat the index in 1 of 60 months, and that is likely due to luck. The AAR suggests that the fund’s returns are not significantly different from the benchmark’s returns, which appears to be correct. Note that the AAR of the incubated fund described in the previous section would also go to zero over a sufficiently long holding period.

CONCLUSION

Accurate abnormal return measurement is crucial for understanding the significance of high- and low-return days, corporate events such as stock splits, and mutual fund performance. In this article, I show that compounding can distort inference in each of these instances. This is true because total holding-period returns contain not only the return from the event itself, but also the return compounded over days that did not contain the event.

I show that there is a tendency to confuse returns from compounding with the return from an event. The distorting effect of compounding is more pronounced when long holding periods are used, because the effects of compounding increase with time. An alternative methodology, known as average abnormal returns (AARs), or the calendar-time portfolio approach, is described. This method of return measurement is not affected by compounding, and therefore may provide more honest appraisals of event significance and fund manager ability.

Compounding does have a place in return measurement, however: whether to compound depends on the question that we are trying to answer. If an investor wants to know what her wealth will be at retirement, then we need to compound the returns of her investment over the savings period. If we are trying to measure how big the abnormal return from a particular event is, then we should not be compounding the event’s return over long holding periods.

APPENDIX

COMPARISON OF ABNORMAL RETURN MEASUREMENT METHODOLOGIES

This exhibit displays the abnormal returns from a portfolio that had a return of 2% in an event month and 1% in all other months. The benchmark portfolio had a return of 1% in all months. Buy-and-hold abnormal return (BHAR) is the difference between the buy-and-hold returns of the portfolio and the benchmark. Cumulative abnormal return (CAR) is the sum of the differences in monthly returns between the portfolio and the benchmark. Average abnormal return (AAR) is the average of the difference in monthly returns between the portfolio and the benchmark.

ENDNOTES

The author is grateful to Claire Lang, Min Maung, Jay Ritter, and Mengxin Zhao for helpful comments.

¹This article was reprinted in 2009.

²The Towneley study is available at http://www.towneley.com/pdf/MT%20Study%2004.pdf.

³These articles also describe a number of statistical issues that arise with BHAR.

⁴Mutual fund alpha, a concept introduced by Jensen [1968], is computed via an AAR methodology.

REFERENCES

Barber, B., and J. Lyon. “Detecting Long-Horizon Abnormal Stock Returns: The Empirical Power and Specification of Test Statistics.” Journal of Financial Economics, 43 (1997), pp. 341-372.

Desai, H., and P. Jain. “Long-Run Common Stock Returns Following Splits and Reverse Splits.” Journal of Business, 70 (1997), pp. 409-433.

Estrada, J. “Black Swans and Market Timing: How Not to Generate Alpha.” The Journal of Investing, Vol. 17, No. 3 (2008), pp. 14-21.

Estrada, J. “Investing in Emerging Markets: A Black Swan Perspective.” Corporate Finance Review, January/February 2009, pp. 14-21.

Evans, R. “Mutual Fund Incubation.” Journal of Finance, 65 (2010), pp. 1581-1611.

Fama, E. “Market Efficiency, Long-Term Returns, and Behavioral Finance.” Journal of Financial Economics, 49 (1998), pp. 283-306.

Ikenberry, D., G. Rankine, and E. Stice. “What Do Stock Splits Really Signal?” Journal of Financial and Quantitative Analysis, 31 (1996), pp. 357-377.

Jensen, M. “The Performance of Mutual Funds in the Period 1945–1964.” Journal of Finance, 23 (1968), pp. 389-416.

John Hancock Investment Advisors. “Saving for College in a Volatile Market.” John Hancock Freedom 529 Market Volatility Message, November 18, 2008. Available at www.johnhancockfreedom529.com/public/site/page/0,,Market_Volatilty_Message,00.shtm.

Mandelbrot, B., and N. Taleb. “A Focus on the Exceptions That Prove the Rule.” Financial Times, March 23, 2006.

Mitchell, M., and E. Stafford. “Managerial Decisions and Long-Term Stock-Price Performance.” Journal of Business, 73 (2000), pp. 287-329.

Taleb, N. The Black Swan: The Impact of the Highly Improbable. New York, NY: Random House, 2007.

To order reprints of this article, please contact Dewey Palmieri at [email protected] or 212-224-3675.
