A HYBRID BAYESIAN AND DATA-ENVELOPMENT-ANALYSIS-BASED
APPROACH TO MEASURE THE SHORT-TERM RISK OF
INITIAL PUBLIC OFFERINGS
by
Shabnam Sorkhi
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Center for Management of Technology and Entrepreneurship
Department of Chemical Engineering and Applied Chemistry
University of Toronto
© Copyright by Shabnam Sorkhi 2015
A HYBRID BAYESIAN AND DATA-ENVELOPMENT-ANALYSIS-BASED
APPROACH TO MEASURE THE SHORT-TERM RISK OF
INITIAL PUBLIC OFFERINGS
Shabnam Sorkhi
Doctor of Philosophy
Center for Management of Technology and Entrepreneurship
Department of Chemical Engineering and Applied Chemistry
University of Toronto
2015
ABSTRACT
Initial public offerings (IPOs) are perhaps the most exhilarating events on stock
exchanges. Yet, the ‘ambiguity’ of the risk of IPOs overshadows the thrill and deters many
investors from considering IPOs at all. It is the insufficient accounting and market
history available at the IPO stage that hampers their proper risk quantification. The main objective
pursued by this thesis is to offer a methodology for measuring the short-term risk of IPOs
which conforms to the mathematical principles of market risk analysis employed in the case
of public companies. Here, short-term risk is defined as the uncertainty associated with the
stock price of the IPO of interest (IPOI) 90 days subsequent to the issuing day and is
quantified as a value-at-risk (VaR) inferred from the probability density function of price on
day 90 (i.e., PDF90IPOI). This thesis develops a Bayesian framework where PDF90IPOI can be
estimated in a recursive and iterative process. In most IPO cases, there exist limited hard data,
yet strong ‘prior’ beliefs (soft data). The Bayesian setting offers a unique risk quantification
approach which suits these two characteristics of IPOs.
To obtain the data required for carrying out the risk analysis, this research relies upon the
‘closest comparable’ of IPOI. The ‘closest comparable’ would be a public firm whose pre-
IPO idiosyncratic financial data most resemble those of IPOI. Furthermore, it is expected to
have gone public in similar macro-economic and sector conditions. Concisely, the risk
quantification process involves two phases: In Phase I, a Data Envelopment Analysis-based
multi-dimensional similarity metric is developed to select the closest comparable. Phase I
thus identifies the most suitable source of ‘prior’ knowledge and passes the output to Phase
II, which encompasses the Bayesian process. Phase II is designed to formulate and refine the
‘prior’ evidence and then employ the resulting ‘posterior’ knowledge to estimate
PDF90IPOI. This PDF90IPOI subsequently acts as the basis for VaR inferences. In the last stage of
the research, the proposed Bayesian VaR methodology is examined (backtested) using the
following two tests: test of uniform cumulative probability values and test of VaR break
frequency.
EXECUTIVE SUMMARY
This thesis proposes a methodology to quantify the short-term risk of investing in initial
public offerings (IPOs). Here, the risk of an IPO firm of interest (IPOI) is defined as a value-
at-risk measure computed from the distribution of the IPOI’s stock price 90 trading
days after the issuing day (i.e., PDF90IPOI). Limited accounting
information and/or short firm history at the IPO time poses a challenge in deriving the
parameters required to estimate PDF90IPOI. To mitigate this problem and to compensate for the
scarcity of data, this research incorporates relevant historical data from past IPOs that
were most comparable to the IPOI when they went public.
Finding comparables introduces another challenge to the risk analysis: selecting
comparables requires defining a ‘similarity’ metric (i.e., a ‘distance’ metric) between the
IPOI and the respective candidates. In the multi-dimensional space of operating and financial
characteristics, each firm can be represented by one point. Yet, a simple ‘Euclidean’ distance
cannot be used to quantify the ‘similarity’ due to the heterogeneity of the dimensions and
units. Motivated by Simak’s (2000) [133] work, this thesis develops a methodology which uses
the framework of data envelopment analysis (DEA) to algorithmically and objectively select
comparables.
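To make the unit-heterogeneity point concrete, consider the following toy sketch. The numbers are purely illustrative (they are not thesis data), and the raw Euclidean distance shown here is precisely what the thesis avoids in favour of a DEA-based metric:

```python
import math

# Two hypothetical candidates, each described by (total assets in $M, profit
# margin as a fraction). The dollar axis dwarfs the ratio axis, so a raw
# Euclidean distance is driven almost entirely by assets.
ipoi   = (500.0, 0.10)   # firm of interest
cand_a = (510.0, -0.40)  # similar size, very different margin
cand_b = (560.0, 0.11)   # different size, near-identical margin

d_a = math.dist(ipoi, cand_a)  # ≈ 10.0
d_b = math.dist(ipoi, cand_b)  # ≈ 60.0
# Euclidean distance names cand_a the "closer" firm despite its negative
# margin, purely because the two dimensions are measured in incommensurable units.
print(d_a < d_b)
```

Because DEA efficiency scores are unit invariant by construction, a DEA-based similarity metric sidesteps this incommensurability.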
In working out the details of the method of comparable selection, this research sets forth a
novel model which augments the literature of ‘super-efficiency’ in DEA. This non-oriented
and non-radial model has been developed in the framework of ‘range directional’ models
(Silva Portela et al., 2004 [132]) and is unique by virtue of simultaneously holding all the
following characteristics: (i) it is unit invariant; (ii) it is translation invariant and has been
designed to maintain the integrity of any input or output array which is composed of positive
and negative elements; that is, the model does not decompose the respective mix into its
negative and non-negative components, and hence precludes any complication imposed by
splitting an input or output vector; (iii) its objective function is structured to minimize the
distance between the efficient DMU and the ‘reference’ point, in terms of both inputs and
outputs; the reference point is the aggregate of the coordinates which serve as the basis of
super-efficiency measurements. In addition, the objective function is designed such that it
conforms to the conventional interpretation of the super-efficiency score; (iv) as pursued in
this thesis, by introducing further constraints, the model can be enhanced to accommodate
‘non-discretionary’ factors as well.
The comparable selection step indicated above forms the first phase of this thesis (i.e.,
Phase I). Once the comparables of a given IPOI are identified, the next phase (i.e., Phase II)
employs the Bayesian framework to find the distribution of the IPOI’s stock price at the 90th
day of trading (i.e., PDF90IPOI). It is assumed that, during this period, the IPOI’s price follows a
geometric Brownian motion (GBM). If the parameters underlying this stochastic process were known,
the methodology could solve for PDF90IPOI directly, which would be a log-normal distribution.
However, these parameters are not fixed (i.e., they are random). Therefore, this research draws
on their joint distribution to estimate PDF90IPOI. The joint distribution of the parameters for any
given IPOI is approximated by the posterior joint distribution of its closest comparable.
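As a sketch of the underlying pricing assumption: if the GBM parameters were known constants (the thesis instead treats them as random and integrates over their posterior joint distribution), the day-90 price would have the closed-form log-normal density below. The parameter values are illustrative only, not thesis estimates:

```python
import math

mu, sigma = 0.0005, 0.03   # illustrative daily drift and volatility
v0, t = 20.0, 90.0         # offer price and 90-trading-day horizon

# GBM implies ln(V_t / V_0) ~ Normal((mu - sigma**2 / 2) * t, sigma**2 * t)
m = math.log(v0) + (mu - sigma ** 2 / 2) * t  # mean of ln(V_90)
s = sigma * math.sqrt(t)                      # standard deviation of ln(V_90)

def pdf_90(v):
    """Log-normal density of the day-90 price under fixed-parameter GBM."""
    return math.exp(-(math.log(v) - m) ** 2 / (2 * s ** 2)) / (v * s * math.sqrt(2 * math.pi))

median_v90 = math.exp(m)  # median of the day-90 price
```

With random parameters, PDF90IPOI becomes a mixture of such log-normal densities weighted by the posterior joint distribution of (mu, sigma).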
The distance function introduced in Phase I incorporates a parameter (𝛼) whose value is
objectively estimated by backtesting, using a recursive-cyclic algorithm. At each cycle, a value
is assigned to 𝛼, and Phase I and Phase II are applied to a comprehensive set of IPOIs (276
IPOIs in this thesis). Using the Bayesian framework, the algorithm then solves for the
likelihood of the given 𝛼, conditioned on the prices realized on day 90 for the participating
IPOIs. This procedure is iterated over different values of 𝛼 in successive cycles, and the
optimum value of 𝛼 is selected as the one which leads to the maximum likelihood
across all the cycles.
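The recursive-cyclic calibration can be sketched as a grid search over 𝛼. Here `estimate_pdf90` stands in for an entire Phase I + Phase II run, and all names are hypothetical placeholders rather than the thesis’s code:

```python
import math

def log_likelihood(alpha, ipois, estimate_pdf90, realized_price):
    """Sum the log densities that the alpha-dependent PDF90 estimates
    assign to the prices actually realized on day 90."""
    total = 0.0
    for ipoi in ipois:
        pdf = estimate_pdf90(ipoi, alpha)        # one Phase I + Phase II run
        total += math.log(pdf(realized_price[ipoi]))
    return total

def calibrate_alpha(ipois, estimate_pdf90, realized_price, grid):
    """One cycle per candidate alpha; keep the maximum-likelihood value."""
    return max(grid, key=lambda a: log_likelihood(a, ipois, estimate_pdf90,
                                                  realized_price))
```

Each cycle is expensive (276 full Phase I + Phase II runs), which is why the thesis evaluates the likelihood over a discrete set of 𝛼 values rather than optimizing continuously.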
Once the maximum likelihood estimate of 𝛼 is obtained (i.e., 𝛼 = 1.5), the
PDF90IPOI’s estimated using this value of 𝛼 undergo extensive accuracy scrutiny by means of
the following two tests: (i) the test of uniform cumulative probability values, and (ii) the test of
VaR break frequency. The outcomes of both tests attest to the validity and predictive utility of the
proposed methodology and establish the yielded PDF90IPOI’s as reliable bases for deriving VaR
inferences.
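The two backtests can be sketched as follows; `cpvs` is assumed to hold one cumulative probability value per IPOI, i.e., the estimated day-90 CDF evaluated at that IPOI’s realized day-90 price (illustrative names, not the thesis’s code):

```python
def uniformity_statistic(cpvs):
    """Kolmogorov-Smirnov distance between the empirical CPVs and Uniform(0,1);
    accurate PDF90 estimates should keep this distance small."""
    cpvs = sorted(cpvs)
    n = len(cpvs)
    return max(max((i + 1) / n - c, c - i / n) for i, c in enumerate(cpvs))

def var_break_frequency(cpvs, p):
    """Fraction of IPOIs whose realized price breached the p-level VaR
    (equivalently, whose CPV fell below p); should be close to the nominal p."""
    return sum(c < p for c in cpvs) / len(cpvs)
```

The first test checks the entire shape of each estimated density; the second checks only its lower tail, which is the region that VaR inferences actually depend on.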
ACKNOWLEDGMENTS
Over the long journey of building this thesis, I eagerly awaited a day that I would write
these lines where I can formally extend my most sincere gratitude to the people who helped
me succeed in this endeavour.
First and foremost, I wish to profoundly thank the members of my PhD committee: Prof.
Roy H. Kwon, Prof. Yuri A. Lawryshyn, and my PhD advisor, Prof. Joseph C. Paradi. It
would not have been possible to complete this thesis without their extraordinary support.
Their insightful suggestions helped me organize and systematically evolve my initial ideas in
the most efficient manner. I am also grateful to them for the time and effort they dedicated to
carefully reviewing this thesis.
I must extend a special note of gratitude to my advisor, Prof. Joseph C. Paradi. Through his
courtesy and consideration, I had all the essential resources at my disposal. But beyond that,
he is an intelligent and forthright life mentor. I am thankful to him for having saved me from
learning many life lessons the intensive and unpleasant way.
I would like to genuinely thank Mr. Eric Kang, the assistant manager at the Finance Lab of
Rotman School of Management. His patience has no limit. He taught me the fundamentals of
many financial databases and never tired of addressing my most fastidious queries.
I wish to record my appreciation to Dr. Xiaopeng Yang with whom I was fortunate to share
office space. Our friendly brainstorming sessions sparked a nucleus idea around which the
framework of ‘Phase I’ gradually evolved. Above all, I will cherish our enjoyable
conversations, restaurant adventures, and his pastry treats.
I must express my gracious thanks to Ms. Leticia Gutierrez and Ms. Gorette Silva for
making my term at the department such a pleasant one. Their warm demeanor shall forever
last as a treasure in my memory.
Certainly, this acknowledgment would have been incomplete without extending heartfelt
thanks to my family. I wish to offer my deepest appreciation to all my family members,
particularly my parents, Mahin and Jafar. I was privileged to be born as the daughter of such
intellectual and affectionate parents who institutionalized the value of education in me at very
early stages of my life. I am forever in their debt for giving me life and for their subsequent
sacrifices and love.
This research was supported by the Ontario Graduate Scholarship, the Queen Elizabeth II
Graduate Scholarships in Science & Technology, and grants to the Center for Management of
Technology and Entrepreneurship from the Financial Services Industry.
TABLE OF CONTENTS
TABLE OF CONTENTS .................................................................................... VIII
LIST OF TABLES .............................................................................................. XIV
LIST OF APPENDIX TABLES ............................................................................. XV
LIST OF FIGURES ............................................................................................ XVI
LIST OF APPENDIX FIGURES........................................................................... XXI
LIST OF ABBREVIATIONS ............................................................................. XXIII
NOMENCLATURE ......................................................................................... XXVI
CHAPTER 1. INTRODUCTION ........................................................................... 1
1.1. Motivation .................................................................................................................1
1.2. Contributions and Challenges ....................................................................................2
1.3. Road Map ..................................................................................................................4
1.3.1. Phase I ................................................................................................................5
1.3.2. Phase II. ..............................................................................................................7
1.3.3. Calibration of the Distance Function .................................................................7
1.4. Organization of the Document ..................................................................................8
CHAPTER 2. LITERATURE REVIEW ON IPO ................................................. 10
2.1. Introduction .............................................................................................................10
2.2. Initial Public Offering Motives................................................................................10
2.3. IPO Timing ..............................................................................................................12
2.4. IPO Mechanisms......................................................................................................13
2.4.1. Book-Building (Firm Commitment) ................................................................13
2.4.2. Fixed Price Mechanism ....................................................................................13
2.4.3. Auctions ...........................................................................................................14
2.4.4. Prevailing Mechanism: Book-Building ............................................................14
2.5. Post-IPO Trading .....................................................................................................15
2.5.1. Difference between Offer Price and First-Day Opening Price ........................15
2.5.2. Post-IPO Underwriter Services ........................................................................16
2.6. Underpricing ............................................................................................................17
2.7. Post-IPO Long-Term Firm Performance .................................................................18
2.8. IPO Valuation ..........................................................................................................19
2.8.1. Discounted Free Cash Flows ............................................................................20
2.8.2. Comparable Multiples ......................................................................................22
2.8.3. Other Techniques .............................................................................................23
2.8.3.1. Asset-Oriented Techniques ........................................................................23
2.8.3.2. Real Options ...............................................................................................23
CHAPTER 3. LITERATURE REVIEW ON METHODOLOGY: COMPARABLE-
BASED METHODS ................................................................................................. 25
3.1. Introduction .............................................................................................................25
3.2. Comparable-Based Risk Analysis ...........................................................................25
3.2.1. Comparable-Based Method - More Reflective of Market Conditions .............26
3.2.2. Discounting Methods - Theoretically Sound but Practically Rigorous ............28
3.2.3. Discounting Methods - Added Complexity with Uncertain Accuracy Gain ...29
3.2.3.1. Evidence from Non-IPO Cases ..................................................................30
3.2.3.2. Evidence from French IPOs .......................................................................32
3.2.4. Comparable-Based Method: Prevalent in the U.S. ..........................................33
3.3. Comparable Firm Selection .....................................................................................35
3.4. Factors Impacting the IPO Price ..............................................................................47
CHAPTER 4. LITERATURE REVIEW ON METHODOLOGY: DATA
ENVELOPMENT ANALYSIS ................................................................................... 53
4.1. Introduction .............................................................................................................53
4.2. Basic DEA Models ..................................................................................................55
4.2.1. CCR Model ......................................................................................................55
4.2.2. BCC Model ......................................................................................................60
4.2.3. Additive Model ................................................................................................61
4.2.4. Slacks-Based Measure of Efficiency ................................................................62
4.3. Negative Inputs or Outputs ......................................................................................64
4.3.1. Translating Data ...............................................................................................64
4.3.2. Treating Negative Outputs (Inputs) as Positive Inputs (Outputs) ....................67
4.3.3. Semi-Oriented Radial Measure ........................................................................68
4.3.4. Variant of Radial Measure ...............................................................................72
4.3.5. Range Directional Models ................................................................................74
4.4. Non-Discretionary Variables ...................................................................................79
CHAPTER 5. LITERATURE REVIEW ON METHODOLOGY: BAYESIAN
INFERENCE IN RISK ASSESSMENT ....................................................................... 88
5.1. Introduction .............................................................................................................88
5.2. Bayesian versus Frequentist Statistics .....................................................................90
5.3. Basics of Bayesian Inference ...................................................................................93
5.4. Merits of the Bayesian Methodology ......................................................................96
5.5. Merits of the Bayesian Approach in Risk Assessment ..........................................101
CHAPTER 6. METHODOLOGY...................................................................... 108
6.1. Review of Objectives .............................................................................................108
6.2. Phase I: Comparable Selection ..............................................................................108
6.2.1. Pool of Candidates .........................................................................................114
6.2.2. Variable Selection ..........................................................................................115
6.2.3. The DEA Model .............................................................................................121
6.2.3.1. Negative Data ...........................................................................................122
6.2.3.2. Non-Discretionary Factors .......................................................................127
6.2.4. Efficient IPOI Treatment ................................................................................133
6.2.5. Algorithm of Phase I ......................................................................................138
6.3. Phase II: Assessment of Short-Term Risk .............................................................146
6.3.1. Stock Pricing Model .......................................................................................146
6.3.2. Distribution of Stock Price 90 Days after the Issuing Day ............................148
6.3.2.1. Estimating the Joint Distribution of the Parameters .................................151
6.3.2.1.1. Estimating the Posterior Distributions ....................................................................... 152
6.3.2.1.2. Estimating the Prior Distributions .............................................................................. 155
6.4. Calibrating the Distance Equation .........................................................................157
CHAPTER 7. DATA ....................................................................................... 161
7.1. Fundamental Financial Data ..................................................................................161
7.2. Sector Index and GDP ...........................................................................................164
CHAPTER 8. RESULTS ................................................................................. 166
8.1. Results of Phase I: Comparable Selection .............................................................166
8.2. Results of Phase II: Assessment of Short-Term Risk ............................................175
8.2.1. Review of the Methodology of Phase II.........................................................175
8.2.2. Uniform Prior Joint Density ...........................................................................178
8.2.2.1. Estimating RΜ and RΣ ..............................................................................179
8.2.2.1.1. Maximum Likelihood Estimation Method ................................................................. 184
8.2.2.2. Estimating RΞ and RΩ. .............................................................................186
8.2.3. Calibrating the Distance Function ..................................................................191
8.2.4. Scrutiny of Estimated Probability Density Functions ....................................203
8.2.4.1. Test of Uniform Cumulative Probabilities ...............................................203
8.2.4.1.1. Non-Parametric Statistical Tests to Examine the Uniformity Assumption ................ 207
8.2.4.2. Test of VaR Break Frequency ..................................................................211
8.2.4.3. Test of Impact of Comparable Selection and Bayesian Updating on
Estimated PDFs .............................................................................................................212
CHAPTER 9. CONCLUSIONS AND FUTURE WORK ....................................... 219
9.1. In Conclusion: IPO Risk Analysis Unravelled Using the Bayesian Perspective ..219
9.2. Future Research Directions and Application Prospects ........................................225
9.2.1. Extension of Applications ..............................................................................225
9.2.1.1. Risk Measurement of Portfolios Containing IPOs ...................................225
9.2.1.2. Hedging the IPO Investment Risk ............................................................230
9.2.1.3. IPO VaR Decomposition ..........................................................................231
9.2.2. Future Research: Incorporating Information from Other Comparables .........231
Direction 1. Virtual Firm ..........................................................................................232
Direction 2. Compound Posterior Joint Density ......................................................234
REFERENCES .................................................................................................. 237
APPENDIX A. COMPLEMENTARY DETAILS ON THE CALIBRATION OF THE
DISTANCE FUNCTION......................................................................................... 253
A.1 Introduction ...........................................................................................................253
A.2 Supplementary Discussion on Section 8.2.4: Scrutiny of Estimated PDF90IPOI’s
under other Values of 𝛼 ........................................................................................................253
A.3 Behaviour of Log-Likelihood Functions of 𝛼 .......................................................260
LIST OF TABLES
Table 6-1. Input-Output Data Used by the CCR Model in an Input-Oriented Analysis .........110
Table 7-1. The table presents the number of remaining IPOs under each S&P sector after
completing the process of data mining. The third and fifth columns display the time
periods spanned by the respective IPOs. .........................................................................163
Table 7-2. Each cell displays the time period spanned by the respective sector and index in
COMPUSTAT. ................................................................................................................165
Table 8-1. For each nominal frequency (ρ) in the leftmost column, the rightmost column
presents the realized frequency, computed as the quotient of the middle column
over the total number of IPOIs (i.e., 276). .................................................213
Table 8-2. This table can be regarded as the counterpart of Table 8-1. The latter reports the
realized VaR break frequencies under the ‘comprehensive’ methodology; whereas, this
table exhibits the outcomes of the test of VaR break frequency for the ‘simplified’
methodology. ...................................................................................................................217
LIST OF APPENDIX TABLES
Table A-1. The table reports the individual K-S p-values computed under different
assumptions of 𝛼. For a given 𝛼, the null hypothesis can be stated as ‘the observed set of
276 CPVs has been drawn from a reference population with uniform density between 0
and 1’. Results indicate that in the cases of 0, 0.5, 1.9, and ≥ 3, there is no strong
evidence in favour of the null hypothesis, which suggests that the respective observed
sample set would be considered a rare event if one assumed that the null hypothesis is
valid. It can, therefore, be concluded that the individual sets of PDF90IPOI’s produced using
these 𝛼 values have failed to fulfil the objectives of the test of uniform CPVs, which
implies that no convincing evidence exists to support the standalone accuracy of each of
these sets. .........................................................................................................................254
Table A-2. Each cell presents the realized VaR break frequency under the corresponding 𝛼
and nominal frequency of VaR breaks. The exhibited frequencies can be interpreted in a
similar fashion to those displayed in Table 8-1 (see, Section 8.2.4.2). ...........................255
LIST OF FIGURES
Figure 1-1. This figure illustrates the main contributions of this thesis. .....................................3
Figure 6-1. This figure demonstrates the efficient hyperplanes determined by the constraints
presented in Equations (6.1) to (6.6). The shared feasible region lies above the efficient
hyperplanes. .....................................................................................................................112
Figure 6-2. This figure complements Figure 6-1 by adding the hyperplane of DMU A which is
represented by Equation (6.1). DMU A is an inefficient DMU, and thus, its hyperplane
lies below the hyperplanes of the efficient DMUs C, D, and E. The set of ‘efficient peers’
of DMU A comprises DMUs D and E. Note that the intersection of DMUs D and E is
where the hyperplane A has the closest proximity to the efficient frontier. DEA,
therefore, determines similar final weights for DMUs D, E, and A. ...............................113
Figure 6-3. A Snapshot of the Pool of Candidates for a Given IPOI in the ‘Health Care’ Sector. .........................................................................
Figure 6-4. General Layout of the DEA Model .......................................................................118
Figure 6-5. A Demonstration of Several Potential Improvement Directions for DMU E .......130
Figure 6-6. This flowchart presents the sequence of steps involved in Phase I of the analysis.
The MATLAB programming software is used to implement all the steps. The written
‘functions’ are classified in three different libraries: (i) Data Mining and Data Processing,
(ii) DEA Models, (iii) Analysis. In total, the libraries comprise approximately 7000 lines
of script. ...........................................................................................................................141
Figure 6-7. The outlier detection algorithm of De Sousa and Stošić (2005) [45] is implemented
as demonstrated by this flowchart. ..................................................................................143
Figure 6-8. This flowchart outlines the process of computing ‘threshold’, indicated in Figure
6-7. ...................................................................................................................................145
Figure 6-9. The figure presents the histogram of the ratio of ‘closing price of the first day to
offer price’ for 327 IPOs in the ‘health care’ sector. The IPOs span the time period of
1990-2012. .......................................................................................................................149
Figure 6-10. Using the data of the ‘health care’ sector (Sector 35 under GICS), this figure
provides an overview of the IPOIs and their ‘pools of candidates’. Each column presents
the IPOs used to carry out Phase I and Phase II for the respective IPOI. The reader is
referred to Section 6.2.1 for further information on the pool of candidates. The first 51
IPOs are referred to as ‘pioneers’. ...................................................................................153
Figure 6-11. This flowchart provides an overview of the process of calibrating the ‘distance
function’ and estimating the optimum value for 𝛼 (see, Equation (6.37)). .....................160
Figure 8-1. The bar plots visualize the number of comparables associated with the respective
IPOIs whose IDs are presented on the horizontal axis. ...................................................168
Figure 8-2. This figure depicts the efficient frontier for the data presented in Table 1.5 of
Cooper et al. (2007) [41]. Note that the output ‘Inpatients’ has been eliminated which
facilitates visualizing the frontier in 3 dimensions. It is illustrated how the shape of the
frontier changes by the exclusion of the efficient DMU G. The production possibility set,
which is capped by the efficient frontier, spans a smaller space subsequent to the removal
of the efficient DMU G. The segmentation of the frontier changes as well; the number of
efficient hyperplanes decreases in this example. .............................................................170
Figure 8-3. This figure demonstrates the ‘average-union’ ratio for individual IPOs. .............172
Figure 8-4. This figure provides an overview of the changes in the composition of the
comparables for the 52nd IPO. The horizontal axis represents the number of candidates
(i.e., the IPOs participating in the respective analysis of comparables). The vertical axis
exhibits the union of all the sets of comparables obtained for the 52nd IPO using the
candidate sets of different sizes. Under a given set of candidates, the selected
comparables are colored in red. .......................................................................................174
Figure 8-5. This figure visualizes the process of determining RΜ and RΣ. .............................181
Figure 8-6. The graph visualizes the estimated PDF90IPOI for the 276th IPOI (i.e., 327th IPO).
The dashed line marks the observed price on day 90 (𝑣90realized) normalized by the
respective inflation-adjusted offer price (𝑣offer). Its position is indicative of the
probability that would be associated with the event 𝑣90realized/𝑣offer ≈ 1.015 if the
proposed methodology of this thesis is employed and if 𝛼 is set equal to 1.5. A similar
process is undertaken for each IPOI in order to obtain the value of the yielded PDF90IPOI
at the respective 𝑣90realized/𝑣offer. These values are recorded to be subsequently utilized in
determining the likelihood of 𝛼 = 1.5 following the steps described within the text. ....192
Figure 8-7. This graph exhibits the log-likelihood function of 𝛼. Each depicted point is
obtained by executing the following two main steps: Step 1 assigns a value to 𝛼 and
executes Phase I and Phase II for each of the 276 IPOIs, which results in 276
individual PDF90IPOI’s under the postulated 𝛼. Step 2 computes the logarithm of the
product of 276 elements, each representing the value of the PDF90IPOI of a given IPOI at the
respective realized 𝑉90. The likelihood function hits its maximum at 𝛼 ≅ 1.5. ..............194
Figure 8-8. The graph visualizes the two PDF90IPOI’s estimated for the 276th (last) IPOI under
the two assumptions of α = 0 and α = 1.5. A narrower PDF is achieved with the
maximum likelihood estimate of α (α = 1.5), which is more desirable since it indicates
less variance and more certainty about the value of V90. .................................................197
Figure 8-9. Similar to Figure 8-8, this plot visualizes the two estimated PDF90IPOI’s for the 17th
IPOI when α = 0 and α = 1.5. The graph serves as an example of the case where the
maximum likelihood estimate of the parameter (α = 1.5) has produced a narrower PDF;
yet, the mode of the PDF is comparatively remote from the realized price, marked by the
dashed line. Such an observation should not be viewed as a cause for concern about
the accuracy of the PDF, since the concept of maximum likelihood bases the comparison
of accuracy upon the probability assigned to the realized event, and not upon the
deviation of the realized event from the most likely value (mode) of the
PDF. .................................................................................................................................198
Figure 8-10. This graph exhibits the outcomes of the calibration process for varying sets of
engaged IPOs. As displayed on the horizontal axis, the set producing the first point on the
plot consists of the 52nd IPO, only. The second point on the figure is generated using the
set comprising the 52nd and 53rd IPOs. In a like fashion, the remaining points are obtained
by sequentially increasing the size of the set of participating IPOs by one. For each set,
the maximum likelihood estimate of α is found through repeating the maximum
likelihood calibration process described in the text. As a case in point, consider the label
52-100 on the horizontal axis; it implies that the corresponding calibration process was
carried out on the basis of the PDF90IPOI’s estimated for the 52nd IPO to the 100th IPO, or
equivalently, the 1st IPOI to the 49th IPOI, under the individual α values depicted on the
vertical axis. For this particular set of length 49, the likelihood function reaches its
maximum at the value of α = 1.25; that is, the maximum likelihood estimate of α is
equal to 1.25. This figure demonstrates that despite early fluctuations in the value of the
maximum likelihood estimate of α, the trend remains steady at α = 1.5 once the number
of participating IPOs rises beyond a certain threshold (i.e., 148). ...................................202
Figure 8-11. The figure presents the histogram of the CPVs computed for the 276 IPOIs (i.e.,
52nd IPO to 327th IPO), using Equation (8.9). ..................................................................209
Figure 8-12. This figure is produced following the same steps used for Figure 8-11. The only
difference is the underlying methodology which generates the CPVs. Figure 8-11
depicts the histogram of the CPVs computed using the ‘comprehensive’ methodology;
whereas, here, the graph is formed on the basis of the CPVs resulting from the ‘simplified’
methodology. ...................................................................................................................216
Figure 9-1. The figure illustrates a one-input and one-output DEA example with a variable
returns to scale frontier of efficiency. Firm G represents the firm of interest whose
comparables are encircled in grey in the upper graph. The lower graph depicts the convex
hull of the set of comparables as well as the ‘virtual’ firm which resides on the boundary
of the convex hull and is designated by Firm K. .............................................................233
LIST OF APPENDIX FIGURES
Figure A-1. The histogram is built using 276 CPVs which are computed on the basis of the
respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 0. ...................257
Figure A-2. The histogram is built using 276 CPVs which are computed on the basis of the
respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 0.5. ................257
Figure A-3. The histogram is built using 276 CPVs which are computed on the basis of the
respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 1.25. ..............258
Figure A-4. The histogram is built using 276 CPVs which are computed on the basis of the
respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 1.9. ................258
Figure A-5. The histogram is built using 276 CPVs which are computed on the basis of the
respective set of 276 PDF90IPOI’s, estimated under the assumption of α ≥ 1.9. Recall from
Section 8.2.3 that starting at the vicinity of α = 3, the log-likelihood function stabilizes,
which indicates that the second term (i.e., the α term) has dominated the first term of the
similarity metric in all the 276 IPOI cases. Therefore, the individual sets of PDF90IPOI’s
estimated from this point onward remain unchanged, which consequently translates into
invariant sets of CPVs, and hence, identical CPV histograms. .......................................259
Figure A-6. The graph demonstrates the log-likelihood function of α computed on the basis of
the subset consisting of the 52nd to 141st IPOs. This figure serves as an example of the
‘first type’ of behaviour described on Page 260. Similar behaviours have been recorded
for the two succeeding subsets (i.e., 52 to 142 and 52 to 143). .......................................262
Figure A-7. The graph demonstrates the log-likelihood function of α computed on the basis of
the subset consisting of the 52nd to 144th IPOs. This figure serves as an example of the
‘second type’ of behaviour described on Page 260. A similar behaviour has been
observed for the set of the 52nd to 145th IPOs. .................................................................262
Figure A-8. The graph demonstrates the log-likelihood function of α computed on the basis of
the subset consisting of the 52nd to 146th IPOs. This figure serves as an example of the
‘third type’ of behaviour described on Page 260. Similar behaviours have been recorded
for the three succeeding subsets (i.e., 52 to 147, 52 to 148, and 52 to 149). ...................263
Figure A-9. The graph exhibits the log-likelihood function of α computed on the basis of the
subset consisting of the 52nd to 150th IPOs. More details regarding this plot are provided
on Page 261. .....................................................................................................................263
LIST OF ABBREVIATIONS
BAM Bounded Adjusted Measure of Efficiency
BCC DEA Model of Banker, Charnes, and
Cooper
CCR DEA Model of Charnes, Cooper, and
Rhodes
CDF Cumulative Distribution Function
CDF-1 Inverse Cumulative Distribution Function
CFO Chief Financial Officer
CI Confidence Interval
CPI Consumer Price Index
CPV Cumulative Probability Value
CRLB Cramer-Rao Lower Bound
CRS Constant Returns to Scale
CUSIP Committee on Uniform Security
Identification Procedures
DCF Discounted Cash Flow
DDM Dividend Discount Model
DEA Data Envelopment Analysis
DMU Decision Making Unit
DT Decision Tree
EBITDA Earnings Before Interest, Tax,
Depreciation, and Amortization
EVA Economic Value Added
EVS Enterprise Value-to-Sales Multiple
GBM Geometric Brownian Motion
GDP Gross Domestic Product
GICS Global Industry Classification Standard
i.i.d. Independent and Identically Distributed
IPO Initial Public Offering
IPOI IPO Firm of Interest
K-S Kolmogorov-Smirnov (Test)
LHS Left-Hand Side (of an Equation)
MLE Maximum Likelihood Estimation
MSBM Modified Slacks-Based Measure
ND Non-Discretionary
NPV Net Present Value
PB Price-to-Book Multiple
PDF Probability Density Function
PDF90IPOI PDF of Stock Price on Day 90 Associated
with the IPO Firm under Scrutiny (IPOI)
PDF90Portfolio PDF of Portfolio Value 90 Days After Its
Formation
P/E Price-to-Earnings Multiple
PPE Property, Plant and Equipment
R&D Research and Development
RDM Range Directional Model of Silva Portela et
al. (2004) [132]
RE Retained Earnings
RHS Right-Hand Side (of an Equation)
RIM Residual Income Model
ROA Real Option Analysis
ROE Return on Equity
SBM Slacks-Based Measure of Efficiency
SDC Securities Data Company Platinum™
SDE Stochastic Differential Equation
SEC Securities and Exchange Commission
SN Standard Normal PDF
SORM Semi-Oriented Radial Measure
VaR Value at Risk
VRM Variant of Radial Measure
VRS Variable Returns to Scale
WRDS Wharton Research Data Services
NOMENCLATURE
Dcb Distance between the IPOI under Scrutiny, Identified by the ID of b, and the Corresponding Comparable Firm c
e Vector of 1’s with a Dimension Matching Its Multiplier
fV90(v90) Unconditional PDF of V90, Symbolizing PDF90IPOI
fV90P(v90P) Unconditional PDF of V90P, Symbolizing PDF90Portfolio
fΜ,Σ,Ξ,Ω(μ, σ, ξ, ω) Joint PDF of the Bayesian Random Variables Μ, Σ, Ξ, and Ω
fΕN(ε; 0, 1) Standard Normal PDF: ε represents a draw from the standard normal random variable Ε.
fULogN(u; ξ, ω) Log-Normal PDF of the Random Variable U, with the Corresponding Normal PDF N(ξ, ω)
gxij Range of Possible Improvement for the ith Input of the jth DMU, in the Range Directional Model of Silva Portela et al. (2004) [132]
gyrj Range of Possible Improvement for the rth Output of the jth DMU, in the Range Directional Model of Silva Portela et al. (2004) [132]
ℒV90(v90|μ, σ, v1) Conditional PDF of V90, Given the Event of (Μ ≈ μ, Σ ≈ σ, V1 ≈ v1)
ῤ Nominal Frequency of VaR Breaks
Rxij Equivalent of gxij, Employed in the Modified Slacks-Based Measure of Sharp et al. (2007) [131]
Ryrj Equivalent of gyrj, Employed in the Modified Slacks-Based Measure of Sharp et al. (2007) [131]
RΜ An Interval Reflecting the Prior Evidence on the Range of Typical Values for Short-Term Expected Daily Rate of Return
RΣ An Interval Reflecting the Prior Evidence on the Range of Typical Values for Short-Term Daily Volatility
RΞ An Interval Reflecting the Prior Evidence on the Ξ Range
RΩ An Interval Reflecting the Prior Evidence on the Ω Range
s− Input Slack Vector of a Given DMU
s+ Output Slack Vector of a Given DMU
t Time
U A Random Variable Denoting the Ratio of the Closing Price of the First Trading Day (V1) to the Offer Price (voffer), for a Given IPO
u A Realization of the Random Variable U
Vt A Random Variable Denoting the Stock Price at Time t
vt A ‘Realized’ Stock Price at Time t
V1 A Random Variable Denoting the Closing Price of the First Trading Day, for a Given IPO
v1 A Realization of the Random Variable V1
V90 A Given IPOI’s Price at the Time Point of 90 Trading Days after the Issuing Day
v90 A Realization of the Random Variable V90
V90P Price of a Portfolio Containing the Stocks of a Given IPO, at the Time Point of 90 Trading Days after the Issuing Day of the IPO
v90P A Realization of the Random Variable V90P
voffer Offer Price of a Given IPO
w Weight of an Asset in a Given Portfolio
x Input Vector of a Given DMU
X Matrix of DEA Inputs
y Output Vector of a Given DMU
Y Matrix of DEA Outputs
Z A Variable Following a Wiener Process
Greek Symbols
α Coefficient of the Second Term (i.e., the ψ Term) in the Distance Function (Equation (6.37))
β Radial Measure of Inefficiency in the VRM Model
δ λ Difference Indicator
Ε A Random Variable with a Standard Normal Distribution
ε A Realization of the Random Variable Ε
η Number of Comparables of a Given IPOI
θ Radial Efficiency Score of BCC or CCR
λ Non-Negative Weight of a DMU in the Envelopment Form of DEA
Μ A Bayesian Random Variable Denoting the Expected Daily Rate of Return of a Given Stock
μ A ‘Realized’ Expected Daily Rate of Return: It is a constant representing a draw from the random variable Μ.
μ̂L Lower Bound of the Confidence Interval for μ
μ̂U Upper Bound of the Confidence Interval for μ
Ξ A Bayesian Random Variable Representing the Mean of the Normal PDF Associated with the Log-Normal PDF of U
ξ A Realization of the Random Variable Ξ
ξ̂L Lower Boundary of the Confidence Interval of ξ
ξ̂U Upper Boundary of the Confidence Interval of ξ
ρ A Slacks-Based Measure of Efficiency
Σ A Bayesian Random Variable Denoting the Daily Volatility of a Given Stock
σ A ‘Realized’ Daily Volatility: It is a constant representing a draw from the random variable Σ.
σ̂L Lower Boundary of the Confidence Interval of σ
σ̂U Upper Boundary of the Confidence Interval of σ
ψ Efficiency Score of the DEA Optimization Problem Developed in this Thesis
Ω A Bayesian Random Variable Representing the Standard Deviation of the Normal PDF Associated with the Log-Normal PDF of U
ω A Realization of the Random Variable Ω
ω̂L Lower Bound of the Confidence Interval of ω
ω̂U Upper Bound of the Confidence Interval of ω
Functions and Operators
∈ Denotes Set Membership: ● ∈ █ means ● is an element or a member of the set █.
|●| Absolute Value or Modulus of ●
Underbar Denotes the Vector Characteristic of ●
‖●‖ ℓ2-Norm or Euclidean Norm of the Vector ●
⊙ Element-Wise Multiplication of Two Vectors
d● Infinitesimal Change in the Variable ●
Δ● Non-Infinitesimal Change in the Variable ●
I(●) An Indicator Function: It equals 1 when its condition, ●, is true and equals 0 when its argument returns false.
ln(●) Natural Logarithm of ●
log(●) Logarithm of ● to Base 10
|CDF−1SN, Ƿ/2| Absolute Value of the Inverse Cumulative Distribution Function (CDF−1) of the Standard Normal Distribution (SN), at the Cumulative Probability of Ƿ/2
CDF−1χ²(n−1), Ƿ/2 Lower Ƿ/2 × 100 % Quantile of the Chi-Square Distribution with (n − 1) Degrees of Freedom
CDF−1χ²(n−1), (1−Ƿ/2) Upper Ƿ/2 × 100 % Quantile of the Chi-Square Distribution with (n − 1) Degrees of Freedom
N(●, ●²) Normal Probability Density Function with Mean ● and Variance ●²
P(❷|❶) Conditional Probability Statement: Probability of Event ❷ Given the Occurrence of Event ❶
P(❷) Unconditional Probability of Event ❷
Notes on Notation
1. Random variables are denoted by uppercase letters.
2. A realization of a random variable is denoted by the corresponding lowercase letter.
3. For an arbitrary random variable Τ, P(Τ ≈ τ) can, more precisely, be expressed as P(τ < Τ ≤ τ + dτ).
Chapter 1. INTRODUCTION
1.1. Motivation
Every year, all over the world, initial public offerings (IPOs) provide an opportunity for many
significant private companies and relatively young firms to raise the capital required for future
growth, market expansion, and/or repayment of debt. Investing in IPOs is inherently risky. It
is, however, the ambiguity, not the quantity, of this risk that often deters risk-seeking
investors from investing in IPOs. Hence, properly quantifying the risk associated with
investment in an IPO can attract many new investors to this market and let them contribute to
the expansion of smaller and younger firms.
A methodology for quantifying the risk of IPOs can benefit other parties engaged in an IPO
event, as well. In general, the ‘game’ of IPO involves two more ‘players’ besides investors:
the IPO company itself and the underwriter. Although not as distinctly recognized as in the
case of investors versus the IPO company, the incentives of the underwriter and the IPO
company are not perfectly aligned, either. Both parties seek to maximize the IPO proceeds;
yet, they face different sets of constraints. Under the ‘book-building’ (‘firm commitment’)
mechanism (see, Section 2.4), the IPO company would aim at the maximum attainable gain
with no litigation risk. The underwriter, in contrast, endeavours to maximize the proceeds
within the prescribed bounds of its commitment to sell the entire stock inventory, while also
enhancing the expected value of its future share of the IPO market by maintaining its
reputation among investors and catering to potential future clients, that is, presently private
firms which may decide to go public. Evidently, all such constraints necessitate hedging
against litigation risk as well. Thus, while the IPO company would intend to achieve zero
‘money left on the table’, the underwriter might ‘underprice’ the issue, to a certain degree, to
cause oversubscription, which would guarantee the exhaustion of the stock inventory (see,
Sections 2.4 and 2.6).
The ‘intentional’ underpricing also serves as a strategy to induce truthful information
disclosure by investors, which would reduce the risk of a ‘broke’ issue or of significant first-
day underpricing; the former scenario (i.e., a broke issue) would tarnish the credibility of the
underwriter’s due diligence among investors, and the latter (i.e., substantial first-day
underpricing) would impair the underwriter’s reputation among potential future clients.
The IPO incentive lattice is undoubtedly more complex and goes beyond the simple
situation pictured above. Chapter 2 is devoted to further clarifying the aforementioned jargon
terms and presents the nuts and bolts of the process of going public. Within this game of
conflicting and aligning interests, a methodology that quantifies the short-term IPO payoff
would provide extra pieces of information, mitigating the risk of extreme short-term
overpricing or underpricing, and would lead to a new equilibrium state where the incentives of
all the players are commonly skewed towards minimizing the risk.
1.2. Contributions and Challenges
Figure 1-1 summarizes the main contributions of this thesis. As depicted, the main
objective is to quantify the short-term risk of IPOs. A comprehensive risk measure for an IPO
ought to be developed based on the distributions of the IPO’s future prices, estimated utilizing
the following two components of information: (i) firm-specific financial details, (ii) market
information. In conformity with the mathematical principles of market risk analysis employed
in the case of public companies, in this thesis, the short-term risk of an IPO of interest (IPOI)
is defined as a value-at-risk (VaR), to be ‘inferred’ from the corresponding ‘predictive’
unconditional probability density function (PDF) which models the uncertainty associated
with the stock price at a short-term post-IPO horizon (here, on the 90th day subsequent to the
issue day). Since VaR is, in essence, just a quantile, the ultimate objective pursued by this
research is, therefore, to estimate the underlying predictive PDF of the stock price of the IPOI
at the time point of 90 trading days after the issue day (PDF90IPOI). This problem, however, is a
fundamentally difficult one: the lack of market information and the limited accounting history
at the IPO time severely restrict estimation of the distribution of the ‘short-term’ payoff.
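Since VaR is just a quantile of the predictive distribution, it follows mechanically once PDF90IPOI is in hand, for instance in the form of Monte Carlo draws. The following is a minimal sketch; the function name, the loss convention relative to the offer price, and the synthetic samples are all illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

def value_at_risk(price_samples, offer_price, confidence=0.95):
    """VaR of a one-share IPO position relative to the offer price.

    Loss is defined here as offer_price - price_90; the VaR at level
    `confidence` is the loss quantile exceeded with probability
    (1 - confidence).
    """
    losses = offer_price - np.asarray(price_samples)
    return np.quantile(losses, confidence)

# Illustration with synthetic draws standing in for samples from PDF90.
rng = np.random.default_rng(0)
samples = 10.0 * rng.lognormal(mean=0.0, sigma=0.3, size=100_000)
var_95 = value_at_risk(samples, offer_price=10.0, confidence=0.95)
```

With a richer representation of the PDF (e.g., a fitted density), the same quantile can be read off the inverse CDF instead of an empirical sample.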
To mitigate this problem, any proposed methodology needs to compensate for the paucity
of financial statements and absence of market history. To this end, this research relies on the
accounting information and market history of preceding IPOs which were ‘comparable’ to the
IPOI at their issuing times. Following Simak (2000) [133], this thesis develops a Data
Envelopment Analysis (DEA)-based high-dimensional similarity metric. This metric has the
capacity to objectively and algorithmically choose the ‘comparable’ IPOs preceding the IPOI.
As indicated, the comparables are identified using a DEA-based algorithm. One step
underlying this algorithm involves selecting comparables for an ‘efficient’ IPOI. The search
for a DEA model which can be utilized toward this end underscored the potential for a
noteworthy contribution to the literature of ‘super-efficiency’ models in DEA.

Figure 1-1. This figure illustrates the main contributions of this thesis.

This study sought a model which would simultaneously fulfil all the characteristics listed
below. Yet, to the best of the author’s knowledge, no DEA study offers such a model. Hence,
drawing on the principles of ‘range directional’ models (Silva Portela et al., 2004 [132]), this
thesis develops a novel model which accomplishes all the following objectives (see, Section
6.2.4):
First. It provides a non-radial measure of super-efficiency for a ‘strongly’ efficient DMU.
Second. It is unit-invariant. Third. It is translation-invariant; in achieving this property, it
is crucial that the proposed model preserves the integrity of any input or output vector
which comprises a mix of positive and negative elements. Such a model would preclude
any complication imposed by decomposing a vector into its negative and non-negative
components. Moreover, only the efficiency score yielded by such a model would comply
with the role considered for it in this thesis; that is, the efficiency score of a DMU acts as a
proxy for its distance from the ‘efficiency frontier’. Fourth. The objective function
minimizes the distance between the efficient DMU (IPOI) and its ‘reference point’, in terms
of both inputs and outputs. Furthermore, the objective function is structured such that the
achieved super-efficiency score can be interpreted in a similar fashion to the ones produced
by conventional super-efficiency models. Fifth. As pursued in this study, by means of
specifying further constraints, the model can be enhanced to cope with ‘non-discretionary’
factors, as well.
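For reference, the conventional radial models that the new model is contrasted with (CCR/BCC; see Chapter 4) reduce to a small linear program. The sketch below is the standard input-oriented BCC (VRS) envelopment model, not the thesis's novel range-directional model, and the toy data are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def bcc_input_efficiency(X, Y, j0):
    """Radial input-oriented BCC (VRS) efficiency score of DMU j0.

    X: (m, n) input matrix, Y: (s, n) output matrix; columns are DMUs.
    Returns theta in (0, 1]; theta == 1 marks a (weakly) efficient DMU.
    """
    m, n = X.shape
    s = Y.shape[0]
    # Decision vector: [theta, lambda_1, ..., lambda_n]
    c = np.r_[1.0, np.zeros(n)]                      # minimize theta
    # Inputs:  X @ lam - theta * X[:, j0] <= 0
    A_in = np.hstack([-X[:, [j0]], X])
    # Outputs: -Y @ lam <= -Y[:, j0]  (i.e., Y @ lam >= Y[:, j0])
    A_out = np.hstack([np.zeros((s, 1)), -Y])
    # VRS convexity constraint: sum(lam) == 1
    A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[:, j0]],
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]

# One-input/one-output toy data: DMU 0 uses twice the input of DMU 1
# for the same output, so its radial score should be 0.5.
X = np.array([[4.0, 2.0, 3.0]])
Y = np.array([[1.0, 1.0, 2.0]])
theta_0 = bcc_input_efficiency(X, Y, 0)
theta_1 = bcc_input_efficiency(X, Y, 1)
```

The sketch only fixes the notation (θ, λ, X, Y) used throughout; the thesis's model replaces this radial, input-oriented formulation to obtain the non-radial, translation-invariant properties listed above.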
Once the comparables are identified, they are utilized in a recursive procedure which is
developed to estimate the distribution of the respective IPOI’s price at day 90 (i.e., PDF90IPOI).
This distribution would then serve as a basis for quantifying risk using a VaR measure.
1.3. Road Map
The process of estimating PDF90IPOI comprises two phases:
Phase I. In this phase, the closest comparable to a given IPOI is identified using the
aforementioned DEA-based similarity metric.
Phase II. This phase offers a Bayesian framework where the knowledge supplied by Phase
I is formulated and ‘refined’ to be utilized in estimating PDF90IPOI, and subsequently,
quantifying the short-term market risk of the given IPOI.
Sections 1.3.1 and 1.3.2 briefly describe these two phases. Section 1.3.3 provides a short
overview of the steps involved in calibrating the distance function (or conversely, similarity
metric), developed in Phase I. In the final stage of the risk quantification process, the proposed
methodology is examined (backtested) to ensure that the outcomes are reliable and the
promised objectives are achieved. The following two tests are employed to this end:
(i) Test of uniform cumulative probability values (see, Section 8.2.4.1)
(ii) Test of VaR break frequency (see, Section 8.2.4.2)
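Both backtests can be sketched from the cumulative probability values (CPVs) alone. The following is an illustrative sketch under simplifying assumptions, not the thesis's exact procedure: it checks uniformity of the CPVs with a Kolmogorov-Smirnov test (probability integral transform) and compares the empirical VaR-break frequency with its nominal rate.

```python
import numpy as np
from scipy import stats

def backtest(cpvs, var_level=0.95):
    """Two checks on a set of predictive PDFs, given the cumulative
    probability value (CPV) of each realized price under its own PDF.

    (i)  If the PDFs are well calibrated, the CPVs are Uniform(0, 1);
         this is tested here with a Kolmogorov-Smirnov test.
    (ii) A VaR break is a realized outcome falling in the lower
         (1 - var_level) tail; its frequency should match that rate.
    """
    cpvs = np.asarray(cpvs)
    ks_pvalue = stats.kstest(cpvs, "uniform").pvalue
    break_rate = float(np.mean(cpvs < 1.0 - var_level))
    return ks_pvalue, break_rate

# Well-calibrated toy case: CPVs drawn from Uniform(0, 1).
rng = np.random.default_rng(1)
ks_p, rate = backtest(rng.uniform(size=5000), var_level=0.95)
```

A large KS p-value and a break rate near 5% would be consistent with a correctly calibrated set of PDFs; systematic deviations in either direction indicate over- or under-dispersed predictive densities.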
1.3.1. Phase I
Phase I is focused on identifying the comparable IPOs to a given IPOI. In this research, the
‘pool of candidates’ for comparables consists of preceding IPOs in the same sector as the
IPOI. Since a given IPOI succeeds its comparables in time, one must account for the time
disparity when selecting the comparables. Comparison criteria should, therefore, be adjusted
for inflation. Furthermore, GDP and the sector-specific index can be incorporated to control
for economic productivity and the performance of the respective sector since these
environmental variables can accelerate or impede the course of actions undertaken by a firm
and, consequently, its success. Once adjusted for the time effect and the impact of
macroeconomic and sector factors, a public firm which most resembled the IPOI at its IPO
stage would be selected as the most similar comparable.
By confining the search space to past IPOs, the proposed methodology accounts for many
IPO-specific events and characteristics, leveling grounds of comparison. As a case in point,
consider issuers’ incentive to manipulate pre-IPO earnings using income-increasing
accounting choices. By limiting the search space to former IPOs, the methodology controls
for the impact of earnings management. In addition, growth prospects would be matched more
accurately; an IPOI and a public firm, which share similar financial characteristics, may not
correspond in terms of growth potential. It is possible that the high growth regime has elapsed
for the public firm and it has reached a steady growth; whereas, the IPOI may be on the verge
of its high growth period.
Finding comparable firms could be a challenging task since it requires defining a similarity
metric (or conversely, a distance metric) between the IPOI and any preceding IPO. Any firm
can be pictured as a point in the multi-dimensional space of operating characteristics and
financial performance. In a heterogeneous space, where the dimensions are of different units
and quantities, a simple ‘Euclidean’ distance metric cannot represent the similarity between
the firms in the real world.
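To see concretely why a raw Euclidean metric fails in such a space, consider three hypothetical firms described by total assets (in dollars) and return on equity (a ratio); all numbers below are invented for illustration. The dollar axis dominates the raw distance entirely, and only after the dimensions are made commensurate can the similarity ordering change:

```python
import numpy as np

# Hypothetical firms: [total assets ($), return on equity].
firm_a = np.array([5.0e8, 0.02])
firm_b = np.array([5.1e8, 0.30])   # similar size, very different ROE
firm_c = np.array([9.0e8, 0.02])   # same ROE, very different size

def euclidean(p, q):
    return float(np.linalg.norm(p - q))

# Raw distances are driven almost entirely by the dollar axis, so
# firm_b looks "closer" to firm_a than firm_c does, regardless of ROE.
raw_ab = euclidean(firm_a, firm_b)
raw_ac = euclidean(firm_a, firm_c)

# After per-dimension normalization the ordering can reverse.
scale = np.array([1.0e8, 0.05])    # hypothetical characteristic scales
norm_ab = euclidean(firm_a / scale, firm_b / scale)
norm_ac = euclidean(firm_a / scale, firm_c / scale)
```

The choice of scaling is itself arbitrary, which is precisely the weakness the DEA-based metric of the next paragraph is designed to avoid: DEA is unit-invariant, so no exogenous scaling decision is needed.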
A DEA-based methodology is developed to algorithmically select the comparables. The
approach for selecting comparables was originally motivated by the work of Simak (2000)
[133]. That algorithm is, however, extended and evolved into a more comprehensive and
analytically rigorous framework, one which not only addresses the limitations of Simak’s
(2000) [133] work but also provides enhanced features and functionality and ensures more
reliable performance. Sections 3.3 and 6.2 present, in detail, how this thesis develops a new
DEA-based methodology for selecting comparables.
thesis develops a new DEA-based methodology for selecting comparables. The main
shortcomings of Simak’s (2000) work can broadly be listed as follows. The reader is referred
to Section 3.3 for a detailed discussion on the objective and limitations of Simak’s (2000) [133]
work.
(i) Simak (2000) [133] utilizes the basic input-oriented, radial model of CCR (CRS) or
BCC (VRS). In this thesis, this model is replaced with new non-oriented DEA
models which can cope with negative and ‘non-discretionary’ inputs and outputs.
(ii) Simak’s (2000) [133] methodology is incapable of identifying comparables for an
‘efficient’ firm of interest. This research develops an additional model to eliminate
this shortcoming.
(iii) Simak’s (2000) [133] approach is subjective and requires analyst input. The algorithm
suggested in this thesis is fully objective. To accomplish this goal, an additional
term, accompanied by a coefficient (parameter), is incorporated into Simak’s (2000)
similarity metric. The new similarity function is then calibrated using an
optimization program which is based on the Bayesian framework. The calibration
process does not require any human intervention.
1.3.2. Phase II
In Phase II, the distribution of the IPOI’s stock price at the time point of 90 days after the
issuing day (PDF90IPOI) is estimated. The steps involved in the process are detailed in Section
6.3. Concisely, the joint probability density function of the following four parameters is
required in order to solve for PDF90IPOI: the expected rate of return of the stock price (μ), the
volatility of the stock price (σ), and the mean (ξ) and standard deviation (ω) of the normal
distribution associated with the log-normal distribution describing the first closing price
normalized by the respective offer price. This joint density function is represented by
fΜ,Σ,Ξ,Ω(μ, σ, ξ, ω) and cannot directly be calculated for a given IPOI due to the lack of
market history. This research, therefore, approximates it by the posterior joint distribution of
the corresponding closest comparable. Phase I supplies the ID of the closest comparable, and
Phase II adopts the Bayesian framework to construct the posterior distribution of the closest
comparable through updating its prior distribution, using the respective realized 90-day
trajectory of stock prices.
Once the prior joint distribution of the four parameters (fΜ,Σ,Ξ,Ω(μ, σ, ξ, ω)) is estimated
for each IPOI, the methodology can then solve for the distribution of the respective IPOI’s
stock price at day 90 (PDF90IPOI). This distribution can subsequently be used to quantify risk.
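The estimation just described lends itself to Monte Carlo simulation: draw the four parameters from their joint density, draw the first closing price via the log-normal distribution of U, and propagate a geometric Brownian motion to day 90. The sketch below is purely illustrative: it assumes independent toy priors in place of the comparable-based joint posterior, treats ω as the standard deviation of log(U), and takes 89 daily steps from the first close to day 90 (the exact conventions are those of Section 6.3).

```python
import numpy as np

def sample_pdf90(v_offer, draw_params, n=100_000, t_days=89, seed=0):
    """Monte Carlo draws from the predictive distribution of V90.

    draw_params(rng, n) must return arrays (mu, sigma, xi, omega)
    sampled from the joint density f_{M,Sigma,Xi,Omega}; it stands in
    for the posterior of the closest comparable.
    """
    rng = np.random.default_rng(seed)
    mu, sigma, xi, omega = draw_params(rng, n)
    # First closing price: V1 = v_offer * U, with log(U) ~ N(xi, omega^2).
    v1 = v_offer * rng.lognormal(mean=xi, sigma=omega, size=n)
    # Geometric Brownian motion from day 1 to day 90 (t_days daily steps).
    eps = rng.standard_normal(n)
    v90 = v1 * np.exp((mu - 0.5 * sigma**2) * t_days
                      + sigma * np.sqrt(t_days) * eps)
    return v90

# Illustrative, independent priors (hypothetical values):
def toy_priors(rng, n):
    mu = rng.normal(0.0005, 0.0002, n)     # daily expected return
    sigma = rng.uniform(0.01, 0.05, n)     # daily volatility
    xi = rng.normal(0.1, 0.05, n)          # mean of log(U)
    omega = rng.uniform(0.05, 0.2, n)      # std of log(U)
    return mu, sigma, xi, omega

draws = sample_pdf90(v_offer=10.0, draw_params=toy_priors)
```

Integrating over the parameter draws in this way is what makes the resulting PDF ‘unconditional’: it blends price-path randomness with the Bayesian parameter uncertainty.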
1.3.3. Calibration of the Distance Function
In addition to the aforementioned parameters (i.e., μ, σ, ξ, and ω), which can indeed be
considered the parameters of the stock price model (see, Section 6.3.1), there exists one
more parameter that must be estimated. Denoted by α, it resides in the distance function
developed in Phase I and represents a coefficient (see, Sections 6.4 and 8.2.3). In order to
estimate this coefficient, the distance function is calibrated using a recursive-cyclic
algorithm. At each cycle, a value is assigned to α. Subsequently, Phase I and Phase II are
repeated for various IPOIs (276 IPOs in this thesis) given the assigned α. Using the Bayesian
framework, the method then computes the likelihood of the assumed α conditioned on the
realized prices of day 90 for the participating IPOIs. The above cycle is iterated
using different values of α. The optimum α is selected as the one whose likelihood is
maximum across all the cycles.
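Structurally, this calibration loop is a profile of the log-likelihood over a grid of α values. The sketch below abstracts the Phase I/Phase II machinery into a caller-supplied function (a hypothetical name, not the thesis's code), and the toy stand-in at the end merely mimics a likelihood peaking near α = 1.5:

```python
import numpy as np

def calibrate_alpha(alpha_grid, pdf90_at_realized):
    """Maximum-likelihood calibration of the distance-function coefficient.

    pdf90_at_realized(alpha) must run Phase I and Phase II for every
    participating IPOI under the given alpha and return the array of
    PDF90 density values evaluated at the realized day-90 prices.
    The log-likelihood of alpha is the sum of those log densities.
    """
    log_liks = []
    for alpha in alpha_grid:
        densities = np.asarray(pdf90_at_realized(alpha))
        log_liks.append(np.sum(np.log(densities)))
    log_liks = np.asarray(log_liks)
    return alpha_grid[int(np.argmax(log_liks))], log_liks

# Toy stand-in whose likelihood peaks at alpha = 1.5 for 276 IPOIs:
grid = np.arange(0.0, 3.01, 0.25)
best, ll = calibrate_alpha(grid,
                           lambda a: np.exp(-(a - 1.5) ** 2) * np.ones(276))
```

Each grid point is expensive in practice (it re-runs both phases for all 276 IPOIs), which is why the thesis iterates over a coarse set of candidate α values rather than a continuous optimizer.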
1.4. Organization of the Document
The remainder of this document is organized as follows.
Chapter 2.
It provides background information on IPOs which is required to understand the challenges
associated with the risk assessment of IPOs.
Chapter 3.
It is composed of three main sections: The literature review presented in Section 3.2 offers
strong grounds for using a market-based comparables approach in quantifying the short-
term risk of IPOs. Section 3.3 reviews the existing methods for selecting comparable firms.
Section 3.4 surveys the literature in order to create an inventory of the financial factors
identified to be influential on the value and short-term performance of IPOs. These factors
will subsequently build the inputs and outputs of the DEA models.
Chapter 4.
The goal of Chapter 4 is to provide a comprehensive literature review on DEA as pertinent
to the methodology suggested in this thesis. Section 4.2 introduces basic DEA models, such
as CCR (CRS), BCC (VRS) and SBM. Subsequently, Sections 4.3 and 4.4 thoroughly
survey the advances in the fields of negative data and non-discretionary factors in DEA.
Chapter 5.
The focal intent of Chapter 5 is to review the merits of the Bayesian approach in risk
assessments. The Bayesian framework hosts a vast array of topics which extend beyond the
capacity of this thesis. The choice of the topics thus follows the specific objectives of this
thesis. Section 5.1 provides an introduction on the concept of VaR and enlightens why the
Bayesian framework would offer a comprehensive setting to tackle VaR problems. Sections
5.2 to 5.4 acquaint the reader with practical nuts and bolts as well as perils and promises of
the Bayesian and frequentist schools of thought. Section 5.5 is devoted to delineating the
promises of the Bayesian wisdom in risk management.
Chapter 6.
The proposed methods are presented in Chapter 6. Sections 6.2 and 6.3 provide a detailed
discussion on Phase I and Phase II. The method of calibrating the distance function is
explained in Section 6.4.
Chapter 7.
Chapter 7 introduces the data used for the study and provides information on the sources of
data. It describes the process of collecting data and matching 9-digit CUSIPs across
different resources. Chapter 7 also gives an overview of the steps involved in data cleaning,
combining, re-classifying, and adjusting.
Chapter 8.
The results of the proposed methodology (Chapter 6) are presented and examined (i.e.,
backtested) in Chapter 8.
Chapter 9.
This chapter summarizes the key conclusions and highlights potential future research
avenues.
Chapter 2. LITERATURE REVIEW ON IPO
2.1. Introduction
This chapter provides a brief overview of the process of going public and the details
surrounding initial public offerings (IPOs). Sections 2.2 and 2.3 explain why companies decide to go public and when the best time for an offering would be. The
shares of an issuing company can be priced and allocated under different mechanisms. These
mechanisms are introduced in Section 2.4. Once the shares are priced and allocated in the
primary market, secondary market trading commences. Sections 2.5 to 2.7 focus on the details
of post-IPO trading, the ‘underpricing’ phenomenon, and the issuing company’s post-IPO
long-term performance. Section 2.8 concludes this chapter by describing the conventional IPO
valuation methods.
2.2. Initial Public Offering Motives
An initial public offering (IPO) is an event where a private company (an issuer) sells a portion of its authorized shares to public investors for the first time. Going public is one of the most fundamental decisions a corporation may face in its life. Through the IPO decision, the issuer commits to major responsibilities, including shared ownership and increased monitoring, in addition to direct one-time filing and offering costs. Thus, a company must have strong motives to offer a fraction of its ownership to the public.
Pagano et al. (1998) [104] studied the causes of going public by investigating the pre-IPO
characteristics of a private firm and the post-IPO performance of the public firm. They
examined a large sample of Italian private companies in the 1982-1992 period. The outcome
of their research indicates the following general motivations for initial public offerings:
(i) The main objective of going public is to attain a high market-to-book ratio. High market-to-book ratios may be indicative of high growth opportunities in the
respective sector which demand high investments. High market-to-book ratios can
also originate from temporary mispricing in hot markets. Pagano et al. (1998) [104]
observed reductions in investments during the post-IPO period. This outcome supports the latter interpretation and implies that issuers may spot hot markets and decide to go public.
(ii) The cost of debt decreases for independent IPOs (i.e., not carve-outs). It is not certain whether the cheaper debt results from the greater transparency of a public firm's business activities or from increased bargaining power due to the larger number of alternative lending institutions.
(iii) According to the empirical analysis of Pagano et al. (1998) [104], portfolio diversification does not seem to be a significant IPO motivation. The high turnover of controlling shareholders in some firms in the three years following the IPO, however, supports the cashing-out scenario in IPOs.
Unique to the Italian market, Pagano et al. (1998) [104] conclude that, in general, Italian IPO firms are larger in size and the IPO market in Italy is not a financing engine for many young firms. On the contrary, U.S.-based research (e.g., Mikkelson et al., 1997 [99]) reveals that the IPO market in the U.S. is a means for many young firms to raise capital for further growth.
Another significant incentive to go public is post-IPO acquisition plans. Brau and Fawcett
(2006) [31] investigate the reasons for going public by surveying 336 chief financial officers
(CFOs) whose companies had successfully completed or attempted but withdrawn IPOs during
the period of January 2000 to December 2002 in the U.S. The outcome indicates that acquisition is the primary objective. The IPO proceeds can be used to finance acquisitions. Alternatively, the IPO firm can itself become a target, since post-IPO transparency of business activities and an observable market value facilitate the acquisition.
From an overall perspective, the IPO motives can be categorized as endogenous and
exogenous factors. Examples of endogenous factors include financial motives (e.g., cheaper debt) and governance restructuring (e.g., venture capitalists' exit). 'Hot' issues markets and
strategic motives (e.g., competitors going public) can exogenously trigger a private firm to go
public.
2.3. IPO Timing
IPO timing can be determined based on the development stage of the firm. After a period of
high growth, a firm may decide to finance the next stages of its projects by offering shares to the public. A firm may also gauge market conditions to determine when to go public. In a study
about the impact of the underwriter reputation on an IPO’s success, Lee (2011) [89] concludes
that an IPO’s success can partially be attributed to the correct timing of a reputable
underwriter.
During ‘hot’ issues markets, many firms decide to go public to exploit the favourable
market conditions and achieve high market-to-book ratios (Pagano et al., 1998 [104]). Hot issues
often occur when the increases from the offer prices to the aftermarket prices are above the
expected market risk premium for a specific cluster of stocks. The above-average short-term
returns prompt investors to funnel more wealth to the respective industry which in turn
motivates more issues. The cycle of self-fulfilling prophecy continues until the market
optimism disappears. New technologies becoming obsolete or reaching their growth ceilings,
unfavourable information spillovers about the newly public firms’ activities and/or unfulfilled
pre-IPO promises can be among the factors leading to ‘cold’ markets.
Hot issues create a temporary window of opportunity for the private firms within the
respective market to have successful offerings. Hence, it is important to identify key hot
market drivers. The hot issues markets have been examined and documented by many
researchers (e.g., Ibbotson, 1975 [76]; Ibbotson and Jaffe, 1975 [77]; Ritter, 1984 [118]; Draho,
2004 [48]; Ljungqvist et al., 2006 [92]). Different scenarios can lead to hot issues markets. These
scenarios span both the investor (demand) and issuer (supply) sides. Issuers learn from the
public offering of their competitors. An IPO completed successfully by a competitor motivates
other firms in the same industry to go public. On the investor side, a new technology and/or
industry can cause investors’ demand to surge. Furthermore, investors may approximate the
expected performance of a new issue by that of the most recent successful IPOs. Such irrational extrapolation may fuel hot markets as well.
2.4. IPO Mechanisms
The IPO mechanism outlines how the underwriter prepares a firm for its IPO and sells
shares to the investors. Draho (2004) [48] categorizes IPO mechanisms into three classes: firm
commitment or book-building, fixed-price offering, and auctions. Prior to delving into more
details, it is worthwhile to note that the primary focus of the IPO discussion in this document
is on the U.S. IPOs unless otherwise stated.
2.4.1. Book-Building (Firm Commitment)
Under the book-building mechanism, the underwriter is responsible for all the preliminary
legal steps for the public offering, share pricing and share allocation. The notion of ‘book-
building’ refers to the underwriter soliciting the investor demand through road trips and
marketing activities and building an order book accordingly. The terminology of ‘firm
commitment’ denotes the underwriter’s commitment to sell the inventory at the price
determined in the final prospectus (Draho, 2004 [48]).
2.4.2. Fixed Price Mechanism
The price is determined in advance if the offering is performed using the fixed price
mechanism. The fixed price offering does not incorporate the firm commitment and book-
building components of the book-building mechanism. The underwriter and issuer set the final
offer price without directly gauging the investor demand in advance. The shares are then
allocated based on the orders submitted. Under this mechanism, the underwriter does not
guarantee the sale of the entire inventory (Draho, 2004 [48]).
2.4.3. Auctions
The most commonly used auction mechanisms are uniform price auction (or Dutch auction)
and discriminatory price auction (Draho, 2004 [48]). In the auction mechanism, the underwriter
is not directly involved in setting the offer price. The investors submit their orders and limit
prices. After collecting all the orders, based on the auction type, the offer price is determined.
If the auction is of the uniform type, the offer price is determined as the intersection of the
supply curve (which is fixed) and the demand curve. The shares are then allocated, at the offer
price, to the investors whose bids exceed the offer price. The orders of the investors with bids
at the offer price may partially be filled, based on the supply availability. In the discriminatory
price auction, the shares are allocated starting from the highest bid until no share remains.
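As a simplified sketch of the uniform-price clearing logic described above (the bid figures are hypothetical, and pro-rata treatment of multiple bidders tied at the clearing price is ignored):

```python
def uniform_price_auction(bids, supply):
    """Determine the clearing price and per-price allocations in a
    uniform-price auction. `bids` is a list of (limit_price, quantity)
    tuples; `supply` is the fixed number of shares offered."""
    # Walk down the demand curve from the highest limit price.
    allocations, remaining, clearing_price = {}, supply, None
    for price, quantity in sorted(bids, reverse=True):
        if remaining == 0:
            break
        fill = min(quantity, remaining)  # the marginal bid may be only partially filled
        allocations[price] = fill
        remaining -= fill
        clearing_price = price           # lowest price at which shares were allocated
    return clearing_price, allocations

price, alloc = uniform_price_auction(
    bids=[(22.0, 400), (21.0, 300), (20.0, 500), (19.0, 600)],
    supply=1000)
# Every winning bidder pays the single clearing price of 20.0; the bid
# at 20.0 receives only 300 of the 500 shares requested.
```

In the discriminatory variant, each winning bidder would instead pay its own bid price rather than the common clearing price.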
2.4.4. Prevailing Mechanism: Book-Building
Despite the generally higher costs associated with the book-building mechanism, it is the
most commonly used mechanism in the U.S. (Ljungqvist et al., 2003 [91]). The book-building
mechanism can lead to the highest efficiency; that is, it can maximize the expected proceeds
(Biais and Faugeron-Crouzet, 2002 [26]). To reach the theoretical maximum, the underwriter
must be informed about different investors’ true pricing of the IPO and set the final offer price
accordingly. This cannot be achieved in practice. However, the book-building mechanism allows the underwriter to acquire maximum information from the investors (Cornelli and Goldreich, 2003 [43]).
The underwriters often determine a preliminary range for the share price, based on their own valuation of the firm. They then collect information from the investors about how much a share is worth to them (i.e., asking for 'private' information). The expected value of the prices disclosed by investors determines the final IPO offer price. Since the investors have the
incentive to understate their true opinion about the share price, it is required that the
underwriters take some steps toward aligning their incentives with the investors’. The study of
Benveniste and Spindt (1989) [20] elaborates on some of these practices.
Benveniste and Spindt (1989) [20] show that in the presence of information asymmetry, the
underwriter can solicit the investors’ honest opinion of the share price by allocating more
shares to those who indicate higher prices. If the majority of the investors report high prices
(i.e., the offering becomes ‘hot’), the underwriter may underprice the shares; that is, instead of
setting the final offer price to the high end of the preliminary offer range, the price is
decreased to some degree. The amount of this reduction varies among different offerings and
can be perceived as the cost of information acquisition. Such intentional underpricing is,
therefore, another practice to reward truthful information disclosure when overall interest is high and discrimination in allocation does not apply. In practice, the private
information disclosure can be compensated using a combination of underpricing and
discrimination in share allocation.
As follows from the discussion above, under the book-building mechanism, the
underwriters do not discriminate in price, but only in allocating shares. The review of different
research indicates that under the book-building mechanism, institutional investors often
receive precedence for IPO shares in the primary market (e.g., Benveniste and Spindt, 1989
[20]; Ibbotson and Ritter, 1995 [78]; Cornelli and Goldreich, 2001 [42]; Ljungqvist and Wilhelm,
2002 [90]; Draho, 2004 [48]).
Note: Henceforth, the IPO mechanism is assumed to be book-building in this document.
2.5. Post-IPO Trading
Trading in the secondary market starts following the share allocation in the primary market,
at the IPO final offer price. The share prices are then determined by independent agents
competing in a public market.
2.5.1. Difference between Offer Price and First-Day Opening Price
The share allocation process and the number of shares offered are the main reasons for the discrepancy between the offer price and the first-day opening price of an IPO. Interested
investors who do not acquire any share or the desired fraction in the primary market (at the
IPO final offer price) can purchase them in the secondary market once the stocks hit the
market. Therefore, secondary market trading is a game between the investors with IPO shares
willing to sell and those who wish to acquire a (larger) stake in the newly-public firm. Any
gain or loss from these trades remains with the selling parties and does not affect the issuer’s
IPO proceeds. It is the volume and types of orders placed with the brokers that determine the
opening price of the first day. A significant investor demand can result in (extraordinarily)
appreciated prices in the secondary market immediately following the primary market.
2.5.2. Post-IPO Underwriter Services
The underwriting contract often obligates the underwriter to perform price-stabilizing activities in the IPO aftermarket. The issuer promises price stabilization to encourage investor participation in the primary market and to increase the IPO final offer price. Underwriters commit to different types of price-stabilizing activities, such as setting constraints on and monitoring reselling for quick gains (i.e., flipping) in the aftermarket (Draho, 2004 [48]). The standard approach to price stabilization is the 'overallotment' or 'greenshoe' option, which can be explained as follows:
The underwriter borrows additional shares from the issuer and sells them in the primary market at the offer price. In the case of poor short-term performance (i.e., price decline in the
secondary market), the underwriter purchases back these additional shares. The practice is an
attempt to create a demand for the flipped shares and prevent price declines in the short-term.
If the prices increase in the immediate aftermarket, the exercise of the overallotment option
gives the underwriter the right to purchase the additional shares at the IPO final offer price
from the issuer. This implies that the underwriter does not have to cover the short position and
the size of the offer increases by the amount of additional shares sold in the primary market
(Aggarwal, 2000 [3]; Geddes, 2003 [57]).
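The overallotment mechanics can be illustrated with a small numeric sketch (the 15% option size and all prices are hypothetical, and the option is assumed to cover exactly the extra shares sold):

```python
def greenshoe_outcome(base_shares, option_frac, offer_price, market_price):
    """Sketch of the underwriter's use of the overallotment option.
    The underwriter sells base_shares * (1 + option_frac) shares in the
    primary market, taking a short position in the extra shares."""
    extra_shares = int(base_shares * option_frac)
    if market_price < offer_price:
        # Prices declined: cover the short by repurchasing the extra
        # shares in the aftermarket, supporting the price.
        return "cover short in aftermarket", base_shares
    # Prices rose: exercise the option and buy the extra shares from the
    # issuer at the offer price; the offer size grows accordingly.
    return "exercise overallotment option", base_shares + extra_shares

# Hypothetical offering of 1,000,000 shares at $20 with a 15% option:
greenshoe_outcome(1_000_000, 0.15, 20.0, 18.0)  # short covered; deal stays at 1,000,000 shares
greenshoe_outcome(1_000_000, 0.15, 20.0, 23.0)  # option exercised; deal grows to 1,150,000 shares
```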
Another important feature of the aftermarket trading is the ‘lock-up’ period set in advance
by the underwriter; that is, the pre-IPO shareholders are not allowed to sell their shares in the
secondary market, for a predetermined period following the IPO. The purpose of the lock-up
period is to signal the insiders’ belief in the firm’s prospects (Ayayi, 2005 [14]).
2.6. Underpricing
In the IPO context, significant research has been devoted to underpricing (e.g., Hughes and
Thakor, 1992 [74]; Garfinkel, 1993 [56]; Arthurs et al., 2008 [11]). Understanding this
phenomenon is essential to correctly quantifying the risk associated with the post-IPO market
prices. Underpricing is defined as the excess of the short-term post-IPO market prices over the IPO final offer price. It can be measured as the difference between the first-day closing price and the IPO final offer price; however, the length of the period considered to estimate
underpricing can be longer (e.g., several weeks). Underpricing is often the consequence of a
two-step process: intentional underpricing arranged by the underwriter and/or issuer and post-
IPO market forces.
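Measured over the first trading day, underpricing is simply the initial return from the offer price to the closing price; a minimal sketch with hypothetical prices:

```python
def underpricing(offer_price, first_day_close):
    """First-day underpricing as a fractional initial return."""
    return (first_day_close - offer_price) / offer_price

# An issue offered at $20.00 that closes at $23.50 on its first day:
underpricing(20.00, 23.50)  # 0.175, i.e., 17.5% underpricing
```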
In the theories proposed to explain the underpricing phenomenon, the IPO is considered as
a game whose key players are the underwriting committee, issuer, and investors. The aligned
incentives or conflicting interests of these players can lead to underpricing. Most of the
proposed hypotheses focus on the intentional underpricing.
One of the most significant hypotheses of intentional underpricing perceives underpricing
as a cost that the underwriter and issuer endure to acquire the investors’ private information.
This hypothesis, suggested by Benveniste and Spindt (1989) [20], is discussed in detail in
Section 2.4.4. Agency conflicts between the underwriter and issuer can also lead to
underpricing. If information asymmetry exists between the two parties, the underwriter has the
incentive to further underprice the new issue. Greater underpricing by the underwriter reduces
the marketing effort required under the book-building mechanism. This theory was first suggested by Baron and Holmström (1980) [18] and Baron (1982) [17]. Other theories have been
postulated to further enlighten the underpricing phenomenon. Ibbotson and Ritter (1995) [78] is
a comprehensive study which assembles these theories.
The final underpricing incorporates a behavioural component besides intentional
underpricing. Interested investors with strong beliefs in the short- or long-term
outperformance of an IPO can cause the aftermarket prices to rise.
Due to the lack of empirical data, it is difficult to determine the contribution of intentional
underpricing. Roosenboom (2012) [121] is the first study which uses empirical evidence from
French IPOs to support the anecdotal claim about intentional underpricing. The researcher
obtained access to the underwriter reports of 228 IPOs during the 1990-1999 period. The
assessment of these reports reveals the percentage of intentional underpricing in the respective
firms. Roosenboom (2012) [121] documents an average intentional underpricing of 18.2%,
where reputable underwriters discount less.
In considering Roosenboom’s (2012) [121] research, it should be noted that there exists a
fundamental difference between the book-building mechanisms in France and the U.S. In
France, the underwriter solicits investor interest prior to the IPO registration. Thus, the
preliminary price incorporates much of the investor sentiment and has already been discounted
to compensate for the information revelation.
2.7. Post-IPO Long-Term Firm Performance
The preceding sections focused on the pre-IPO process and short-term post-IPO
performance. Understanding the post-IPO long-term performance is equally important.
According to Ritter (1991) [119], such information would impact the IPO investors’ long-term
trading strategies. Furthermore, it would reveal how much the short-term prices are influenced
by proper IPO timing (i.e., going public in hot markets) and behavioural components like
buying frenzy.
Some empirical evidence indicates poor post-IPO long-term performance for the equity of
the firms. Ritter (1991) [119] is the first academic study which documents this phenomenon
based on an empirical analysis. The study examines 1526 new issues during the period of 1975
to 1984 across different industries. These issuing firms are compared to their matching firms
(i.e., public firms matched in terms of industry and market capitalization). The comparison
shows that on average, the issuing firms underperform the stock market in the three-year
period following their IPOs. The outcomes show greater underperformance for the firms going
public during hot markets. This evidence suggests that the short-term outperformance of these
firms can be attributed to the firms taking advantage of the temporary ‘window of opportunity’
to go public (see, Section 2.3).
The long-term underperformance can also be the consequence of the overvaluation of the
firms by investors at the time of going public. In the case of the firms with rapid pre-IPO
growth, investors may extend high pre-IPO growth rates over a long period in the future and,
hence, assign high values to the firms. Certo et al. (2009) [33] postulate that the long-term underperformance can be the outcome of the sophisticated competition to which a newly-public firm is exposed. The firm's strategic plans and resources to engage actively in the
competition define its long-term performance.
Studying a sample of 4753 IPOs during the 1970-1990 period, Loughran and Ritter (1995)
[93] also report smaller post-IPO 5-year average returns compared to a control group of non-
issuing firms. An interesting feature of the long-term underperformance is its persistence despite having been documented in different sources. Loughran and Ritter (1995) argue that the pattern of short-term outperformance and long-term underperformance does not disappear (i.e., is not arbitraged away) since there always exist optimistic investors with strong beliefs in the fate of the IPO firms. These investors systematically overvalue the IPOs and it takes a long
period for them to realize the actual potential of the firms.
2.8. IPO Valuation
The first and perhaps the most critical step in the process of going public is the valuation of
the firm. According to Draho (2004) [48] and Roosenboom (2012) [121], underwriters most frequently use the comparable multiples technique (see, Section 2.8.2) for this purpose. The most commonly used multiples are reported to be price-to-earnings (P/E), price-to-cash flow, price-to-sales, and enterprise value ratios.
Underwriters, particularly in the U.S., do not often disclose their approach to valuing IPOs.
However, based on the evidence provided by Draho (2004) [48] and Roosenboom (2012) [121],
the methods described in Sections 2.8.1 to 2.8.3 can be considered standard.
2.8.1. Discounted Free Cash Flows
In theory, the value of equity can be determined as the sum of the present values of the expected free cash flows that the shareholders would receive in the future. The expected free
cash flows can be discounted at the risk-adjusted discount rate (see, Ross et al., 2010 [122]).
Alternatively, one can adjust the expected cash flows for risk (i.e., determine certainty
equivalents) and use the risk-free rate to discount the certain cash flows.
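The two discounting routes described above can be written compactly as follows (the notation is introduced here for illustration: $\mathit{FCF}_t$ is the free cash flow in year $t$, $r$ the risk-adjusted discount rate, $\mathit{CE}_t$ the certainty equivalent of $\mathit{FCF}_t$, and $r_f$ the risk-free rate):

```latex
V_0 \;=\; \sum_{t=1}^{T} \frac{\mathrm{E}[\mathit{FCF}_t]}{(1+r)^t}
    \;=\; \sum_{t=1}^{T} \frac{\mathit{CE}_t}{(1+r_f)^t}
```

When the risk adjustments are made consistently, both routes should yield the same value.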
The accuracy of the discounted cash flow (DCF) method is limited by the accuracy of the cash flow projections. Owing to its strong theoretical basis, this technique can yield correct results if free cash flows are projected accurately. In practice, the cash flows are often
predicted by extrapolating the past events to the future and/or by assuming fixed (i.e., pre-
committed with no flexibility) and limited scenarios for the future operations and activities of
the firm. These somewhat arbitrary and highly subjective projections (common in practice)
would adversely impact the accuracy of the valuation outcomes.
The accuracy of the outcomes is also sensitive to the discount rate used to calculate the
present value of the expected free cash flows. In the case of IPOs, where there is no stock price
history, finding an appropriate risk-adjusted discount rate can be challenging. Often the
discount rate for IPOs is approximated by the industry cost of capital with some case-specific
adjustments (Draho, 2004 [48]).
In addition to the DCF method, this section can be generalized to include other discounting
techniques as listed below:
(i) Dividend Discount Model (DDM)
(ii) Residual Income Model (RIM)
The dividend discount model (DDM) would be a suitable technique in the case of
companies with steady dividend payments. Considering that dividends form the only stream of cash flows to the shareholders, DDM states that the current stock price can be computed by summing the present values of all future dividends.
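In equation form (with $D_t$ the expected dividend in year $t$ and $r_e$ the shareholders' required rate of return; notation introduced here for illustration):

```latex
P_0 \;=\; \sum_{t=1}^{\infty} \frac{\mathrm{E}[D_t]}{(1+r_e)^t},
\qquad\text{and, for dividends growing at a constant rate } g < r_e,\qquad
P_0 \;=\; \frac{D_1}{\,r_e - g\,}.
```

The second expression is the familiar constant-growth (Gordon) special case.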
Although the DDM method avoids the challenges associated with free cash flow
projections, it is incapable of generating reliable outcomes for firms without regular dividend payments. Furthermore, as in the case of DCF, the accuracy of the outcomes
would be highly sensitive to the estimates of the dividend growth rate and the discount rate
(i.e., shareholders’ required rate of return).
The residual income model (RIM) is another frequently employed technique to estimate the
‘fair value’ of stocks. The technique is particularly proven to be useful in valuing firms with
no steady dividend payments or negative free cash flows. Essentially, RIM assumes that the
current stock price of a company is equal to its current book value plus the sum of present
values of all future residual incomes, where ‘residual income’ is defined as the difference
between the corresponding net income and shareholders’ expected dollar return (i.e., cost of
equity × shareholders’ equity).
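With $B_t$ denoting the book value of equity at the end of year $t$, $\mathit{NI}_t$ the net income, and $r_e$ the cost of equity (notation introduced here for illustration), the RIM statement above reads:

```latex
P_0 \;=\; B_0 \;+\; \sum_{t=1}^{\infty} \frac{\mathit{RI}_t}{(1+r_e)^t},
\qquad \mathit{RI}_t \;=\; \mathit{NI}_t \;-\; r_e\,B_{t-1}.
```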
The rationale behind RIM is to account for the equity cost in addition to the cost of debt
which is already incorporated into the figure of ‘net income’. Thus, the net income which is
adjusted for the equity suppliers’ expected return for the period would be a more
representative figure of a company’s profit. This approach would clarify if the company is
indeed profitable when its net income is adjusted for the risk taken by its shareholders.
RIM is indeed equivalent to DDM. The dividend-based framework can be transformed into the residual income-based framework by substituting dividends with net income net of retained earnings. Residual income in the above-indicated form is then obtained
following some algebraic reformulations detailed in Palepu and Healy (2008) [105]. Although
RIM and DDM are mathematically identical, the latter is only applicable to the companies
with predictable and steady dividends; whereas, by means of the aforementioned substitutions,
the former allows for the valuation of a wider variety of companies.
As to the challenges associated with RIM, the projections of future residual incomes share the same complications as the free cash flow predictions under the DCF method. Moreover, finding an appropriate discount rate (i.e., the cost of equity in this case) makes the valuation practice challenging, especially in the absence of market information.
2.8.2. Comparable Multiples
Models described in Section 2.8.1 are sometimes referred to as ‘absolute’ models. The term
implies that ‘value’ is estimated with reference to the fundamental financial information and
the models do not utilize the information available from other companies in the respective
sector to estimate value.
The method of comparable multiples, on the other hand, is considered a ‘relative’ model
implying that the model estimates value with reference to other companies. This technique
chooses a comparable firm or a group of comparable firms and uses their multiples (e.g.,
price-to-earnings and market-to-book ratios) to determine the value of the firm of interest. In
theory, the comparable firm should match the firm under scrutiny in terms of fundamental
characteristics such as net income and its growth rate, risk, and the product or service offered.
Section 3.3 discusses different approaches for selecting comparable firms in practice.
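As an illustration of the relative approach, a minimal sketch that applies the median peer multiple to the firm of interest (all figures are hypothetical; in practice several multiples are combined and adjusted for differences in growth and risk):

```python
import statistics

def comparable_multiple_value(target_earnings, peer_pe_ratios):
    """Estimate the equity value of the firm of interest by applying the
    median price-to-earnings multiple of its comparable firms."""
    return statistics.median(peer_pe_ratios) * target_earnings

# Peer P/E ratios of 15.2, 18.5 and 16.0 and target earnings of $40M:
comparable_multiple_value(40.0, [15.2, 18.5, 16.0])  # 16.0 * 40.0 = 640.0 ($M)
```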
Some of the weaknesses associated with the comparable multiples method can be listed as
follows (Draho, 2004 [48]): (i) The method is limited to sufficiently diverse markets where
there exist several public firms comparable to the firm of interest, (ii) the accuracy of the
valuation outcome is affected by the accuracy of selecting the comparable firm(s), (iii) all the
assumptions and financial incentives underlying the price(s) of the comparable(s) would be
incorporated into the value estimates of the firm under scrutiny.
As indicated at the onset of this section, the method of comparable multiples is the most
frequently used IPO pricing technique and, for that matter, is the most commonly cited
technique in the IPO literature. Such interest in the comparable multiples technique is not
unique to IPOs. Comparables are extensively used to estimate diverse financial ratios. Section
3.2 clarifies the implication of such a wide usage of comparables on developing a short-term
risk metric for IPOs.
2.8.3. Other Techniques
In addition to the aforementioned standard models, depending on the characteristics of the
IPO of interest, underwriters may choose to improve the valuation outcomes using specific
techniques. The remainder of this section provides a summary on two such methods: asset-
oriented methods and real options.
2.8.3.1. Asset-Oriented Techniques
Asset-oriented techniques are often used to provide tentative value estimates for asset-
intensive businesses, such as manufacturing businesses and financial services companies. The
value is often assumed to be the liquidation value of the business; that is, what its assets would be worth if the company ceased its business activities. To arrive at an estimate of the
value, the approach assesses the market worth of tangible assets, net of any liability. Examples
of tangible assets would be assets in the raw material inventory, finished goods, and fixed
assets (e.g., property, plant and equipment). Liquidation costs and other miscellaneous costs
associated with terminating a business, such as severance claims by employees, should also
be taken into account.
One shortcoming of the above-indicated asset-oriented technique is the lack of explicit
accounting for intangible assets and the potential of the business to grow in the future and
create value from its tangible assets (Draho, 2004 [48]). Thus, asset-oriented methods have limited application in the IPO market, which mainly comprises young businesses that would be prone to undervaluation under this method.
2.8.3.2. Real Options
Real Option Analysis (ROA) is a framework which has been developed to analyze and
evaluate any arbitrary investment opportunity. ROA is based on the theory of financial option
pricing. However, in contrast to a securities option, a real option is an investment in a real
asset, for example, a business project. The technique gives the right, but not the obligation, to
exercise a specific decision, such as to withdraw from or expand the project, at some future
time. Although the initial seed of ROA was planted by early papers such as Black and Scholes (1973) [27] and Merton (1973) [98], it was Stewart Myers who first used the term 'real option' (Myers, 1984 [101]). He argued that the standard DCF method
underestimates the value of the real options attached to growing businesses.
ROA is an approach which combines strategic intuition and rigorous analytic computations.
ROA is perceived as a complement to the DCF technique and adds extra flexibility to the
valuation framework. The value of a project or a company (i.e., a portfolio of various projects)
can be described as a combination of its value with no flexibility and the value of its options.
Based on anecdotal evidence from U.S. underwriters (Draho, 2004 [48]) and firm evidence
from French IPOs (Roosenboom, 2012 [121]), the methodology of real option has not yet gained
a wide application in the IPO industry. The reason is that the technique is mathematically
intensive; yet, its contribution to accuracy remains unclear. ROA does not pose a suitable
solution to our problem, either. The reason can be described as follows: The main objective of
this research is to quantify the short-term risk of investing in IPOs, and the risk will be
estimated based on post-IPO market prices (see, Section 6.3). The reluctant and narrow
acceptance of the real options method by investors minimizes its impact on investors’
perception of future performance of companies, and consequently, their stock prices. Thus, the
seemingly insignificant impact of the real options method on market prices makes the
technique extraneous to this research.
Chapter 3. LITERATURE REVIEW ON METHODOLOGY:
COMPARABLE-BASED METHODS
3.1. Introduction
This chapter centers on the literature of the ‘comparable-based’ methods. In Section 3.2, it
is established why it would be a prudent approach to base the risk assessment of IPOs on the
market information of the respective comparables. Section 3.3 covers the history of the
techniques involved in selecting comparables and extensively discusses their shortcomings.
Based on the survey conducted in this section, this thesis later proposes a new technique for
selecting comparables which addresses these limitations and augments the literature of
comparable firm selection (see, Chapter 6). The final section surveys the work focused on
identifying the factors impacting the IPO valuation and pricing. The scrutiny of such studies
assembles a set of factors which subsequently serve as the comparison criteria (dimensions of
comparison) between the IPOI and the candidates to act as its comparables.
3.2. Comparable-Based Risk Analysis
This section aims to establish why founding the risk analysis on comparables offers a
promising approach to quantify the market price-based risk of IPOs. By the end of this section,
it will become clear why top valuation specialists including bulge bracket investment banks
and ‘first team-ranked’ analysts frequently base their valuation on comparables. Furthermore,
evidence is provided that such specialists will perhaps be the key providers of direction to the
market. Thus, considering the prevalence of the comparable-based approach among analysts
and its potentially significant role on market prices, this approach can be suggested as the
building block of the risk quantification of IPOs (see, Chapter 6). To state it simply, if the
information derived from comparables impacts prices, then, a comprehensive risk metric ought
to be designed such that it extracts maximum information from the most representative group
of comparables.
The comparable-based valuation method is often contrasted with the more theoretically
acceptable discounting models. The wide interest in the former compared with the latter can be
justified from three different standpoints:
(i) The outcomes of the comparable-based technique are more reflective of market
characteristics and periodicities, behavioural components as well as investors’
general and specific perception of value (see, Section 3.2.1).
(ii) There are intrinsic limitations associated with the discounting (‘absolute’) models
(see, Section 3.2.2).
(iii) In terms of equity price projections, the increased complexity in case of discounting
models is often compensated with insignificant gain in accuracy (see, Section
3.2.3).
In the remainder of this discussion, Sections 3.2.1 to 3.2.3 detail each item on the above-
indicated list. Section 3.2.4 closes this discussion by providing evidence on the prevalence of
the comparable multiples method among U.S. bulge bracket investment banks and ‘first team-
ranked’ analysts whose reports often cause strong market reactions.
3.2.1. Comparable-Based Method - More Reflective of Market
Conditions
Discounting techniques such as DCF determine value primarily with reference to the ‘free’
cash generating potential of the business. These techniques do not utterly account for many
behavioral components, investors’ perception of value, or market conditions which impact
stock prices. Some of such details may be reflected in individual investors’ projections of
future free cash flows or discounting rates; e.g., in hot markets, investors tend to forecast
future free cash flows more optimistically. However, in general, the DCF framework does not
account for aggregate investor preferences, opinions and expectations, herding dynamics, etc.
As an example of the investors’ perception of value, consider R&D costs. Research has
shown that the R&D expenditures are positively correlated with the prices (e.g., Bartov et al.,
(2002) [19]). This implies that the market assigns a positive value to the option of the firm to
commercialize the product (and expand) upon the success of the R&D project. Part of the
information asymmetry present at the IPO stage can indeed be attributed to how investors
value these options (Hall, 2005 [64]). Contrary to what has just been stated, under any
discounting method, R&D costs would be considered period expenses, and may even lead to
negative free cash flows for the period. In other words, within the DCF framework, not only
do R&D expenditures not contribute to the value, but they adversely impact the value.
Behavioral components, such as reacting to the underwriter’s reputation and the retention
ratio of the insiders’ shares, also cannot be directly incorporated into a cash flow-based
valuation framework. They have no influence on the firm’s operations and its capability to
generate free cash flow and, yet, they impact prices (e.g., Draho, 2004 [48]; Certo et al., 2009
[33]).
In the case of such complicating market features, a market-based valuation approach would
lead to more descriptive outcomes. As a case in point, again, consider R&D costs.
Furthermore, consider the following two players: (i) a current IPO (i.e., IPO of interest) (ii) a
past IPO (i.e., a currently public firm whose IPO took place in the past). Given similar R&D
expenditures under similar aggregate (macro) market conditions at the issue time, it can be
assumed that the R&D information would be factored in for the current IPO in a similar
fashion to the past IPO. Underlying such assumption is the fact that other things being equal,
the average investor’s recognition of value drivers does not change critically over time.
It is worthwhile to note here that mathematically rigorous methods have been developed in
an attempt to address some aspects of the above-indicated complications surrounding market
value estimations. An example is the variety of stochastic differential equation (SDE)-
based techniques developed in an attempt to capture part of the uncertainty
associated with a firm's future free cash flows, and hence, improve the estimation of its
current stock price (e.g., Schwartz and Moon, 2000 [127]; 2001 [128]; Maya, 2004 [97]; Berk et al.,
2004 [23]; Wu et al., 2008 [139]). It should, however, be noted that even the most comprehensive
valuation technique, as long as it stays private and is not shared information among investors
or market makers, would not deliver accurate estimations of the market prices. Furthermore,
research has shown that such analyst-specific techniques do not supplant the standard
valuation techniques discussed in Section 2.8; top-tier underwriters and first-team ranked
analysts only marginally depend on these mathematically demanding techniques in estimating
market values (Roosenboom, 2007 [120]).
3.2.2. Discounting Methods - Theoretically Sound but Practically
Demanding
As explained in Section 2.8.1, the DCF technique has a firm theoretical basis. It calculates
the value as the sum of the discounted expected future free cash flows. Therefore, the accuracy
of the outcomes of this method heavily relies on the quality of cash flow projections.
Forecasting cash flows is the most error-prone and challenging step of the DCF method.
Future free cash flows generated by a firm fluctuate over time due to the occurrence of various
events such as market moves, competition, or success/failure of the launch of a new product.
Some of these events may cause significant up or down jumps in the free cash flows. The free
cash flow can be considered as a combination of a deterministic dynamic component and a
stochastic component. As mentioned, in theory, the DCF technique computes the expected
values of the free cash flows at given points in time and discounts them back to the present.
This would require considering many different scenarios for the evolution of the free cash
flows. Subsequently, for each point in time, the technique should calculate the expected value
of the respective incremental free cash flows across different scenarios.
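The scenario-averaging step described above can be sketched as follows; this is a minimal illustration, and the cash-flow paths, probabilities, and discount rate are hypothetical values, not data from any study cited here.

```python
# Illustrative sketch of scenario-based DCF: for each future period, take
# the probability-weighted average of the free cash flows across scenarios,
# then discount the expected values back to the present.

def expected_dcf_value(scenarios, probabilities, discount_rate):
    """scenarios: list of equal-length FCF paths, one per scenario.
    probabilities: scenario weights summing to 1."""
    n_periods = len(scenarios[0])
    value = 0.0
    for t in range(n_periods):
        # Expected incremental free cash flow at time t+1 across scenarios.
        expected_fcf = sum(p * path[t] for p, path in zip(probabilities, scenarios))
        value += expected_fcf / (1 + discount_rate) ** (t + 1)
    return value

# Three hypothetical three-year FCF paths (in $M): boom, base, bust.
scenarios = [[120, 150, 180], [100, 110, 120], [60, 40, 20]]
probabilities = [0.25, 0.50, 0.25]
print(round(expected_dcf_value(scenarios, probabilities, 0.10), 1))  # 253.7
```

In practice the scenario set and weights are exactly what analysts rarely specify formally, which is the gap the next paragraph describes.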
In practice, however, the expected free cash flows, under the traditional DCF technique, are
estimated in a somewhat arbitrary fashion, subject to the analyst's opinion and/or by
extrapolating past trends into the future. Furthermore, under this technique, limited investing
and operating scenarios are assumed for the firm and the firm pre-commits to each scenario
with no flexibility for change conditional on future events.
The practical challenge of the DCF technique is not limited to forecasting future free cash
flows. As also detailed in Section 2.8.1, finding an appropriate discount rate, which can be
used to calculate the net present value of the future cash flows, poses another challenge. This
shortcoming is pronounced in the case of IPOs where the businesses have no stock price
history. Due to the mathematical sensitivity of the discounting methods to the growth rate and
discount rate, any minor error in the estimations of these two factors could lead to major
swings in the final estimates of value.
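This sensitivity can be illustrated with the Gordon growth perpetuity formula, V = FCF_1 / (r - g), a standard ingredient of DCF terminal values; the rates and cash flow below are hypothetical illustration values.

```python
# Gordon growth value V = FCF_1 / (r - g): a small shift in either rate
# produces a large swing in value, because the denominator is the
# (typically small) difference between the two rates.

def gordon_value(fcf_next, r, g):
    assert r > g, "discount rate must exceed growth rate"
    return fcf_next / (r - g)

base = gordon_value(100, 0.10, 0.05)     # 100 / 0.05 = 2000
shifted = gordon_value(100, 0.10, 0.06)  # 100 / 0.04 = 2500
# A one-point change in the growth rate moves the value estimate by 25%.
print(round(base), round(shifted), round((shifted - base) / base, 2))
```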
The aforementioned practical obstacles have impeded the broad adoption of the
theoretically-strong discounting methods by the analysts. Within the context of French IPOs,
Roosenboom (2007) [120] has shown that often it is in the volatile or high-growth markets
where underwriters assign greater weights to the outcomes of the DCF method. The reason can
be stated as follows: In high-growth markets, analysts can choose future growth rates of cash
such that the valuation outcomes closely follow market prices, and then, they will have a broad
latitude in defending their choices of growth rates. Furthermore, once volatility hits stock
prices, investors tend to rely more on the DCF method since, as an 'absolute' method (see,
Section 2.8.2), it determines value primarily based on the operations of the firm and its
potential to generate cash in the future.
3.2.3. Discounting Methods - Added Complexity with Uncertain
Accuracy Gain
Despite its firmer theoretical foundation and computationally intensive process,
research shows that the DCF method has accuracy similar to that of the comparable multiples
method in predicting market prices.
3.2.3.1. Evidence from Non-IPO Cases
Using the DCF method, Kaplan and Ruback (1995) [82] valued a sample of 51 highly
leveraged transactions (HLTs) which took place in the 1983-1989 period. The study shows
that the outcomes of the DCF and comparable multiples method stay similar and they both
have comparable accuracy in predicting the market values of the transactions. The comparable
multiples method is centered upon market value to EBITDA multiples, each emerging from a
different definition of comparable company. The researchers argue that the comparable
multiples technique is capable of producing more accurate results than the DCF method,
provided that the company under scrutiny and its comparable correspond in risk and are
proportionally comparable in the size of their cash flows. Furthermore, the choice of a multiple could impact
the accuracy of outcomes since the multiple is the reference connecting the relative values of
the two companies. The comparable multiples approach which meets all the above-indicated
criteria is expected to predict market values more accurately than the DCF method does since
it additionally accounts for market conditions and characteristics (see, Section 3.2.1).
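The multiples calculation at the center of this setup can be sketched as a minimal computation; the function and the figures below are hypothetical illustrations, not values from Kaplan and Ruback's study.

```python
# Sketch of a comparable multiples valuation in the spirit of a
# market-value-to-EBITDA approach: apply the comparable's multiple to
# the subject firm's EBITDA to obtain a relative value estimate.

def multiple_valuation(subject_ebitda, comparable_value, comparable_ebitda):
    multiple = comparable_value / comparable_ebitda  # reference ratio linking the two firms
    return subject_ebitda * multiple

# Comparable trades at 800 / 100 = 8x EBITDA; subject firm has EBITDA of 60.
print(multiple_valuation(60.0, 800.0, 100.0))  # 480.0
```

The accuracy condition stated above maps directly onto this sketch: the estimate is only as good as the proportionality between the two firms' risk and cash flow size, since both are compressed into the single multiple.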
In a different setting from Kaplan and Ruback’s (1995) [82] study and within the context of
businesses emerging from Chapter 11 bankruptcy, the research of Gilson et al. (2000) [59] also
attested to the broadly similar outputs of the DCF and comparable multiples techniques.
The objective of the study was to compare the following two figures for 63 ‘reorganized’
companies over the period of 1984-1993: (i) the market value post reorganization (ii) the value
obtained using the cash flow predictions that had been supplied as part of the process of
Chapter 11 bankruptcy. The study also estimates value using the comparable multiples
method. The multiple is selected to be the industry median of the ratio of total
capital/EBITDA. The performance (i.e., value) measure for the firm under scrutiny is assumed
to be the projected EBITDA for the first year after reorganization. The study concludes that
the DCF and comparable methods generate similar results and lead to a similar level of accuracy
in predicting market values.
The comprehensive research conducted by Asquith et al. (2005) [13] further supports the
notion of indifference between the DCF model and the comparable multiples method. The
primary objective was to create a unique database which collects and categorizes the details
provided in analyst reports, such as price targets and valuation techniques. To this end, the
study manually scrutinized 1126 equity analyst reports released during the 1997-1999 period.
In order to be included in the study, analysts had to be members of Institutional Investor’s All-
American Research Team and ranked as ‘first-team’ at least one time. This implies that the
research only included top analysts whose buy-sell-hold recommendations would likely elicit
strong reactions by market participants.
The setting of Asquith et al.’s (2005) [13] research facilitated pursuing another goal which
was studying the association between the target price accuracy and the respective valuation
technique. The study reports that there is no significant association between the accuracy of
target price forecasts and the techniques used to arrive at these estimates, provided that the set
of techniques is limited to revenue or earnings multiples or DCF. Moreover, the choice of
valuation method was likewise shown to be ineffective in driving market reaction. Interestingly, the
research reports reduced accuracy in forecasted target prices if analysts use other techniques
such as Economic Value Added (EVA) or some analyst-specific techniques, uncommon
among analysts.
Worthy of note here is the definition of 'accuracy' in Asquith et al. (2005) [13]: a target
price is considered 'accurate' if the forecasted price or a higher price is realized within
the one-year period corresponding to the forecast. In this study, the analysts'
projected target prices were accurate in 54% of the cases. The projection accuracy can be
considered satisfactory for the rest of the cases where, on average, 84% of the target price was
observed in the market. Section 3.2.4 highlights more details from this study regarding the
distribution of valuation methods amongst financial analysts.
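The accuracy criterion used in that study amounts to a simple check against the price path; the helper function and price series below are hypothetical illustrations of the definition, not data from the study.

```python
# Sketch of the 'accuracy' criterion attributed to Asquith et al. (2005):
# a target price counts as accurate if the forecasted price, or a higher
# price, is realized at some point within the one-year forecast window.

def target_hit(target_price, prices_one_year):
    """prices_one_year: observed prices over the year after the forecast."""
    return max(prices_one_year) >= target_price

print(target_hit(50.0, [42.0, 47.5, 51.2, 49.0]))  # True: peak 51.2 >= 50
print(target_hit(50.0, [42.0, 44.0, 46.0]))        # False: peak 46 < 50
```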
3.2.3.2. Evidence from French IPOs
The previous section (Section 3.2.3.1) demonstrated that in non-IPO cases, the DCF
technique and comparable multiples approach (particularly, earnings multiples) bear similar
accuracy levels. The research on IPO cases, on the other hand, is sparse; the underwriters,
particularly in the U.S., do not often disclose the valuation details, and information such as
cash flow forecasts and the like is seldom filed with the SEC (Kaplan and Ruback, 1995 [82]). In the
IPO literature, only a few studies could access the underwriters' reports containing the details of
their valuation methodologies. Roosenboom (2007 [120], 2012 [121]) has perhaps conducted the
most comprehensive study in this regard.
The objective of Roosenboom’s (2012) [121] study was to document the methodologies used
by underwriters to compute ‘fair’ values of the IPOs. Moreover, the study elucidates the IPO
pricing process which shows all the modifications applied to the fair value estimate prior to
setting the final IPO offer price. To achieve the research goals, Roosenboom (2012) [121]
scrutinized the underwriter reports of 228 French IPOs issued in the 1990-1999 period.
He reports the following three methods as the most frequently employed ones by underwriters:
(i) comparable multiples, (ii) dividend discount model (DDM), (iii) DCF. Underwriters may
assign a greater weight to a technique based on idiosyncratic characteristics of the firm of
interest or aggregate market conditions (Roosenboom, 2007 [120]; 2012 [121]); however, overall,
the comparable multiples technique, used in more than 87% of the cases, remains the most
frequently used technique. The bias and accuracy of these methods are compared below. It
should be noted that, to prevent any sampling bias which could skew the outcomes in favor of
one method, any pairwise comparison between the techniques is conducted across only the
firms for which both techniques were employed.
Roosenboom (2012) [121] calculated ‘bias’ as the ratio of (value estimate - equilibrium
market value) to (equilibrium market value). He documented that all three techniques lead to,
on average, positively-biased outcomes; however, there is no significant difference between
the techniques in terms of bias. As with the case of ‘bias’, this study does not detect any
significant difference between the ‘accuracy’ of the three methods; accuracy is defined as the
absolute value of bias.
Concisely, despite the sparse public information on underwriters’ valuation details, the
scrutiny of the reports of 228 French IPOs provides firm evidence that, at least in the case of
French IPOs, the comparable multiples method and the computationally intensive DCF
method exhibit similar accuracy and bias in predicting equilibrium market values.
3.2.4. Comparable-Based Method: Prevalent in the U.S.
The comparable multiples method is the most commonly used valuation method in the U.S.,
perhaps owing to its more intuitive approach to valuation combined with outcomes roughly
identical to those of the mathematically demanding DCF method. The remainder of this section is focused on the
studies which further support this fact.
According to the study conducted by DeAngelo (1990) [46], the comparable multiples model
is favored by bulge bracket investment banks such as CS First Boston (CSFB). In one of its
reports, CSFB documented the sensitivity of the DCF method to the terminal value estimates
and indicated that they favored the earnings multiple model over the DCF method.
Asquith et al.’s (2005) [13] study, introduced in Section 3.2.3.1, provides important
information regarding the distribution of the valuation methods amongst All-American
Research Team members, considered to be top U.S. analysts whose reports and buy-sell-hold
recommendations can lead to changes in the direction of the market. As indicated, the primary
objective of Asquith et al.’s (2005) [13] study was to create a unique database which collects
and categorizes the details provided in the analyst reports such as target prices and valuation
techniques. To accomplish this goal, the researchers scrutinized 1126 equity reports written by
first-team ranked sell-side analysts. The reports covered the period of 1997 to 1999.
The assessment of the reports reveals that the earnings multiple method, employed by
99.1% of the analysts, is the most frequently used method; whereas, the usage of DCF or ‘any
of the other discounted cash flow approaches favored by finance textbooks and MBA
curriculums’ is not as dominant; only 12.8% of the
analysts report using this method. The DCF method is employed more frequently if
downgrading is recommended. The research indicates that it is used in 20.8% of the
downgrade reports. Worthy of recalling here is the accuracy of target prices forecasted by
these analysts (refer to Section 3.2.3.1 for further details): The analysts forecasted target prices
‘accurately’ in 54% of the cases. The ‘accuracy’ of the remaining cases (i.e., 46%) can be
considered satisfactory since, on average, 84% of the target price was observed in the one-year
period corresponding to the target price forecast.
Houston et al. (2006) [70] is another study which indicates the prevalence of the comparable
multiples method among the U.S. analysts. This study was, however, conducted in a different
setting than Asquith et al.’s (2005) [13] research and focused on the initiating coverage reports
for IPO firms. One of the main objectives of this study was to investigate the methodologies
applied by the analysts in arriving at their target price estimates. Note that the target price of a
stock is an estimation of its price one year from the date the report is released. To perform this
analysis, the study scrutinized 454 reports initiating coverage on 248 firms whose IPO took
place during the 1996-2000 period. In order to be included in the study, a report had to be
published sometime during the 115-calendar day period following the respective firm’s issue
date. The research revealed that the DCF method was used in 10% of the reports. The
comparable multiples method was applied in a notably higher number of cases (i.e., 73% of
the reports) with earnings multiples being the most dominant. Furthermore, the research shows
that 87% of the firms were linked to at least one analyst report which used the comparable
multiples method to estimate the respective target prices. Houston et al. (2006) [70] also
examined the association between the characteristics of a firm and the valuation method
preferred by the analysts for forecasting target prices. The results indicate that the comparable
multiples method is preferred to other mathematically rigorous methods in the case of young and
small businesses which pose many valuation challenges.
The studies presented in this section (Section 3.2) have documented that the method of
comparables is utilized frequently and yields estimates which are accuracy-wise comparable to
other analytically rigorous valuation techniques. The accuracy of the comparables approach in
predicting prices can be attributed to two different sources: (i) Value estimates derived with
reference to comparables incorporate market data (i.e., trends, insights, etc.) which are hard to
account for with other valuation methods (see, Section 3.2.1), (ii) The methodology is the
most frequently applied in the first-team ranked analysts’ reports. These reports initiate
significant market reaction and may impact the stock trends, skewing them toward the
recommended direction. Researchers have documented such impact of analyst reports on the
stock prices (e.g., Brav and Lehavy, 2003 [32]; Houston et al., 2006 [70]). Considering the
dominance and accuracy of the comparable-based approaches in predicting prices, they could
similarly offer a suitable ground for projecting future distributions of prices, and hence,
estimating the short-term risk. The process is detailed in Chapter 6.
3.3. Comparable Firm Selection
As established in the previous section, a comparable-based methodology is adopted in this
thesis to quantify the short-term risk of IPOs (also see, Chapter 1). One of the challenges of
this approach is identifying appropriate comparable firms to the firm under consideration. This
section reviews the existing methodologies for selecting comparables and notes any
shortcoming that may make the respective technique incompatible with the objectives of this
research.
A comparable firm, which is often selected from the industry of the firm of interest, is
expected to bear a resemblance to the firm of interest in terms of risk, future growth rates, and
earnings (Draho, 2004 [48]). The approaches adopted in the literature to find comparables are
quite diverse; yet, they fail to match all the above-indicated aspects simultaneously. The
studies by Boatsman and Baskin (1981) [29] and Alford (1992) [4] are presented below as cases
in point.
Boatsman and Baskin (1981) [29] studied two different approaches for selecting a
comparable firm: (i) selecting a firm randomly from the corresponding industry, (ii) examining
the 10-year mean growth rate of earnings for individual firms from the same industry and
selecting a firm with the most similar mean. Using the P/E ratio, the second approach yielded
smaller price prediction errors.
Alford (1992) [4] examined three different approaches for choosing a comparable firm: (i)
based on industry using 4-digit SIC codes, (ii) based on risk which is proxied by size (i.e., total
assets), (iii) based on earnings growth using book return on equity (ROE) as its measure. The
study examined three different combinations of these approaches as well (i.e., industry and
size, industry and growth, and growth and size).
Using the P/E multiple to test the above-indicated approaches, Alford (1992) [4] concludes
that the industry approach and the combination of risk and earnings growth produce
comparable outcomes in terms of accuracy in predicting stock prices. The pairwise
combination of the industry approach with the approach of risk or earnings growth does not
significantly increase the accuracy of predictions. This implies that industry indeed
incorporates the impacts of risk and earnings growth. Selecting comparables using risk or
earnings growth individually does not provide effective solutions, either. Overall, the industry
approach is recommended in this study. Furthermore, it is noted that selecting comparable
firms based on industry yields more accurate valuation outcomes for larger firms than for smaller ones.
Choosing comparables for an IPO firm is more complicated. The limited history available
at the IPO stage restricts the use of factors such as the long-term mean growth rate of earnings, used
by the studies presented above. The work of Kim and Ritter (1999) [87] was focused on valuing
IPOs and comparing the outcomes to the offer prices. The sample consisted of 190 IPOs from
the 1992-1993 period and the valuation technique was selected to be the comparable multiples
method. Two groups of firms were used as comparables: (i) firms that had recently gone
public in the same industry, (ii) firms selected by Renaissance Capital [117] which is a
‘boutique’ [87] research firm specializing in IPO valuation.
The first approach algorithmically chooses comparables based on the 4-digit SIC code.
Although this technique can only be as accurate as the SIC classification is, it hedges against
biases in the selection process. As an illustrative instance, consider hot-issue markets where
underwriters have an incentive to manage the selection of comparable firms in order to justify
their high estimations of the P/E multiple. The selection approach of Renaissance Capital, on
the other hand, is based on the information provided in the prospectuses about the firm’s direct
competitors. The prediction error is reduced when using the second approach.
Another study in the IPO context is conducted by Purnanandam and Swaminathan (2004)
[114]. Their primary objective was somewhat similar to Kim and Ritter’s (1999) [87]: they
investigated whether IPOs are overpriced with respect to the offer price. To this end, the study
compared the offer prices to the values obtained using price multiples and concluded that the
median IPO in their sample of more than 2000 IPOs (1980-1997) is overpriced. As to the
process of choosing comparable firms, Purnanandam and Swaminathan (2004) [114] adopted a
different approach than Kim and Ritter (1999) [87]. The comparable firm, which is a non-IPO
firm, is selected such that it matches the IPO of interest in terms of industry, sales, and
EBITDA profit margin. The study assumes that industry-matching would restrict the sample of
candidates to those firms with some potential of matching the IPO of interest with respect to
operating growth prospects, risk and profitability. Factors representing the operating
performance could be extended beyond sales and EBITDA profit margin. However, the
methodology suggested by Purnanandam and Swaminathan (2004) [114] cannot handle an
increased number of such factors; due to its sequential nature, as the set of factors is
broadened, the chance of finding a single comparable decreases. The matching methodology
can be described as follows:
The study classifies the firms based on the industry classification system of Fama and
French (1997) [50]. They used 4-digit SIC codes to cluster firms into 48 industries. Each
industry group is divided into three sales portfolios based on sales and, each sales portfolio, in
sequence, is divided into three margin portfolios based on EBITDA profit margin. The
comparable firm to the IPO of interest is then selected from the margin portfolio of the IPO of
interest and is defined to be the firm with the most similar level of sales to the IPO of interest.
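The sequential matching just described can be sketched in a few lines; the candidate data below are hypothetical, and the industry-classification step is simplified to a pre-filtered candidate list rather than the Fama-French 48-industry scheme.

```python
# Sketch of the sequential matching of Purnanandam and Swaminathan (2004):
# within the IPO's industry, firms are split into sales terciles, each sales
# tercile into EBITDA-margin terciles, and the comparable is the firm in the
# IPO's margin portfolio with the closest sales.

def tercile(values, x):
    """Index (0, 1, 2) of the tercile of sorted `values` that x falls into."""
    ranked = sorted(values)
    cuts = [ranked[len(ranked) // 3], ranked[2 * len(ranked) // 3]]
    return sum(x >= c for c in cuts)

def select_comparable(candidates, ipo_sales, ipo_margin):
    """candidates: list of (name, sales, ebitda_margin) from the IPO's industry."""
    sales_all = [s for _, s, _ in candidates]
    pool = [c for c in candidates if tercile(sales_all, c[1]) == tercile(sales_all, ipo_sales)]
    margins = [m for _, _, m in pool]
    pool = [c for c in pool if tercile(margins, c[2]) == tercile(margins, ipo_margin)]
    # Within the matching margin portfolio, pick the closest-sales firm.
    return min(pool, key=lambda c: abs(c[1] - ipo_sales))

candidates = [("A", 50, 0.10), ("B", 80, 0.12), ("C", 120, 0.20),
              ("D", 150, 0.22), ("E", 400, 0.30), ("F", 500, 0.35)]
print(select_comparable(candidates, ipo_sales=130, ipo_margin=0.21)[0])  # C
```

The sketch also makes the sequential-filtering weakness concrete: each successive split shrinks the candidate pool, so adding further matching factors quickly empties it.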
Industry-based approaches approximate the performance of the firm of interest with the
overall characteristics of its industry. These approaches disregard firm-specific details such as
operating and financial characteristics. In theory, the comparable firm must resemble various
aspects of the firm under scrutiny such as the product or service, profitability, short-term
solvency, turnovers, and capital structure or financial leverage.
A comparable firm which satisfies all the above-indicated criteria seems unachievable in
practice. However, a sufficiently similar firm can be identified based on a multi-dimensional
comparison between important characteristics of the firm under scrutiny and other candidate
firms. The study of Purnanandam and Swaminathan (2004) [114] extends the basis of the
comparable selection process beyond using only industry multiples, by additionally accounting
for similarities in sales and EBITDA profit margin. However, as indicated above, due to the
sequential nature of their approach, it is impractical to base the comparison on a broad
category of factors. The review of literature returns two studies whose proposed algorithms
take into account multiple accounting and financial factors in choosing comparables: (i)
Bhojraj and Lee (2002) [25], (ii) Simak (2000) [133]. This section concludes with a concise
description of these two studies.
In addition to developing an algorithm which identifies comparables primarily based on
multiple financial variables, Bhojraj and Lee’s (2002) [25] study stands apart from the prior
studies in two other respects:
(i) The previous studies selected the most suitable multiple as the one which best
explains the stock price of the firm of interest (e.g., Price-to-Earnings multiple). The
study by Bhojraj and Lee (2002) [25], on the other hand, chooses ex ante the
enterprise value-to-sales (EVS) and price-to-book (PB) ratios, because these two ratios are deemed representative of intrinsic value and can be obtained by restructuring the residual income model and the dividend discount model, respectively.
(ii) Unlike prior studies, which tested the comparable-based method by its capacity to predict the stock price accurately, the study of Bhojraj and Lee (2002) [25] examined the accuracy of the method based on its capacity to predict the EVS and PB multiples. The reason is that this study is concerned with the intrinsic value of the firm of interest and aims to develop a comparable-based methodology which can be utilized to detect over- or under-priced stocks. Therefore, the stock price itself does not present a suitable reference for gauging the accuracy of the method.
The methodology suggested by Bhojraj and Lee (2002) [25] identifies the comparables using regression, which explains why it can incorporate multiple accounting and financial factors. Moreover, using regression in conjunction with the EVS and PB multiples as value indicators makes the methodology applicable to unprofitable firms with negative earnings. Note that the widely used P/E multiple is not suitable in the case of negative earnings. Bhojraj and Lee (2002) [25] adopted the following steps to identify peers. As will become clear, the methodology is objective and can, therefore, be automated.
First, the cross-sectional variation of EVS or PB is modeled as a function of eight explanatory variables, all pertaining to the same year but possibly spanning different industries; that is, the data belong to firms from different industries. Considering that the PB and EVS ratios can be obtained by restructuring the residual income and dividend discount models, the set of explanatory variables comprises the ratios which either directly appear in these two models (e.g., return on equity (ROE)) or act as the underlying factors of the ratios which directly appear in the two models (e.g., return on net operating assets, defined as operating profit divided by net operating assets). In order to estimate a firm's current or future EVS or PB ratio, the study uses the regression equation resulting from the last year's data. The estimated EVS or PB is termed the 'warranted' EVS or PB.
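The two-step 'warranted multiple' idea can be sketched as follows. This is a minimal illustration with synthetic data: the three-regressor design and all variable names are hypothetical (Bhojraj and Lee use eight explanatory variables), and the sketch only shows the mechanics of fitting on last year's cross-section and applying the coefficients to current-year value drivers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical last-year cross-section: EVS explained by a few value
# drivers (the actual study uses eight explanatory variables).
n = 200
X_last = rng.normal(size=(n, 3))
beta_true = np.array([0.8, -0.3, 1.2])
evs_last = 2.0 + X_last @ beta_true + rng.normal(scale=0.2, size=n)

# Step 1: estimate the regression coefficients on last year's data (OLS).
A = np.column_stack([np.ones(n), X_last])
coef, *_ = np.linalg.lstsq(A, evs_last, rcond=None)

# Step 2: apply last year's coefficients to this year's value drivers
# to obtain the 'warranted' EVS for each firm under scrutiny.
X_now = rng.normal(size=(5, 3))
warranted_evs = np.column_stack([np.ones(5), X_now]) @ coef
print(np.round(warranted_evs, 2))  # one warranted EVS per firm
```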
Second, Bhojraj and Lee (2002) [25] proposed five different ‘definitions’ for comparable
firms. In order to assess the significance of each of these definitions in providing the current or
up to three-year forward estimates of EVS or PB, the two ratios were regressed on a set of
explanatory variables. Each member of this set is the representative of one definition of
comparable firm. The members can be listed as follows: (i) ‘harmonic’ mean of the actual (not
warranted) EVS or PB ratios of the firms which are classified under the same industry as the
firm under scrutiny. In the study, a firm’s industry is determined based on its 2-digit SIC code.
The harmonic mean can be computed by taking the inverse of the mean of the inverted ratios (e.g., EVS's), (ii) harmonic mean of the EVS or PB ratios of the four firms whose market
capitalizations remain closest to the firm under scrutiny and whose industry is the same as the
firm of interest, (iii) the warranted EVS or PB ratio of the firm under scrutiny itself, (iv)
harmonic mean of the EVS (PB) ratios of the four firms with the closest warranted EVS (PB)
ratios to the EVS (PB) ratio of the firm under scrutiny, (v) harmonic mean of the EVS (PB)
ratios of the four firms whose warranted EVS (PB) ratios remain closest to EVS (PB) ratio of
the firm under scrutiny and whose industry is the same as the firm of interest as well.
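The harmonic mean used in definitions (i), (ii), (iv), and (v) — the inverse of the mean of the inverted ratios — can be sketched as follows, with hypothetical peer EVS ratios:

```python
def harmonic_mean(ratios):
    """Inverse of the mean of the inverted ratios, the averaging
    convention used for valuation multiples such as EVS or PB."""
    inverted = [1.0 / r for r in ratios]
    return 1.0 / (sum(inverted) / len(inverted))

# Hypothetical EVS ratios of four same-industry peer firms.
peers_evs = [2.0, 4.0, 4.0, 8.0]
print(harmonic_mean(peers_evs))  # ≈ 3.556 (vs. arithmetic mean 4.5)
```

The harmonic mean downweights large ratios, which is why it is preferred over the arithmetic mean when averaging multiples.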
The results from the second stage demonstrate that including warranted multiples (i.e.,
definitions (iii) to (v) discussed above) significantly improves the predictive power of the
regression equations. Bhojraj and Lee (2002) [25] thus concluded that using comparables which
are selected based on the proximity of the warranted multiples leads to more accurate
predictions of EVS or PB. It should, however, be noted that the outputs of the PB regression are not as satisfactory as those of the EVS model.
Despite the merits of Bhojraj and Lee’s (2002) [25] approach, the following prominent
limitations are associated with their methodology:
(i) This method is regression-based. In consequence, it incorporates all the limitations
associated with the regression framework. As a case in point, regression is a
parametric approach which requires an ex ante specification of the relationship
between the dependent variable (or output) and explanatory variables (or inputs).
The accuracy of regression estimates is thus affected not only by the choice of explanatory variables but also by how they are related to the dependent variable. This
limitation is even more critical in the case of stock prices. Not only is the stock price
associated with a broad category of value drivers but the nature of this association is
complex and may be quite arbitrary. In such cases, non-parametric methods offer a
more concrete and integrative approach since they endogenously characterize the
link between the inputs and outputs and do not require a priori knowledge of the
functional relationship between them. Section 6.2 provides more details on selecting
comparables based on financial and accounting metrics and using a non-parametric
method called data envelopment analysis (DEA).
(ii) Outliers can substantially distort the regression fit. In order to mitigate the effect of
outliers, Bhojraj and Lee (2002) [25] excluded extreme low and high ends of the data.
While this step may improve the outcomes in the case of firms positioned close to typical observations, it can adversely impact the comparable selection for extreme firms, as they may happen to match the excluded outliers more closely.
(iii) The effectiveness of choosing comparables based on the proximity of EVS or PB
multiples is benchmarked against the accuracy of the EVS or PB forecasts yielded
by these comparables. Therefore, the capacity of the methodology to estimate prices
accurately remains indefinite. Hence, this approach does not fulfill the objectives of
this thesis. Considering that our research centers upon developing a risk measure,
which is essentially price-based, it is the price estimation efficacy that should
determine which comparable selection algorithm is indeed the most suitable.
(iv) In accordance with the preceding point, the accuracy of Bhojraj and Lee’s (2002) [25]
method is influenced by the choice of multiples. As discussed earlier, choosing
comparables based on warranted multiples leads to satisfactorily accurate
predictions only for EVS; the PB outcomes, although improved, remain far from
their market realizations.
(v) The methodology involves some details which render its use restricted in the case of
IPOs:
(a) The second regression described above incorporates an explanatory variable (ISEVS or ISPB) which can be computed as follows: The
market capitalizations of all the firms from the same industry as the firm
of interest are ranked and four firms with the closest market
capitalizations to that of the firm of interest are selected. The harmonic
mean of the EVS or PB ratios of these four firms represents a value for
ISEVS or ISPB, respectively. As inferred from the context, this
approach is inapplicable to IPOs as no market information exists for IPO
firms.
(b) Bhojraj and Lee’s (2002) [25] regression model requires data from two
consecutive years (i.e., last and current year) in order to estimate
warranted multiples for each firm under scrutiny. This requirement of
the model may not be fulfilled considering the data scarcity at the IPO
stage.
(vi) Bhojraj and Lee’s (2002) [25] first regression model includes consensus earnings
growth forecasts provided by I/B/E/S. Analysts’ forecasts are often subjective and
may be optimistically biased (Houston et al., 2006 [70]). Their use must, therefore, be accompanied by great caution; i.e., such forecasts should be scrutinized in light of
the analysts' motivations, particularly agency-driven incentives and local politics
(e.g., potential future business opportunities between the firm of interest and the
institution bound to the analyst).
(vii) The algorithm is multiples-based; that is, it draws on ratios to find comparables.
Basing the analysis on ratios as opposed to absolute values brings along the
disadvantages associated with ratios; e.g., by dividing two accounting factors, the
individual pieces of information contained within each figure are lost. Likewise, the
division may lead to the loss of economies of scale.
(viii) The cross-sectional regression which produces warranted multiples is calibrated
based on firms from different industries whose accounting data pertain to the same
year. This model is essentially regressing EVS or PB on its potential value drivers.
Calibrating the regression equation using cross-industry data would assign greater
weights to general value drivers and would disregard or assign less significance to
the value drivers specific to a particular industry. Negative earnings in the
biotechnology industry can be considered as an example for this point. Negative
earnings - arisen from, e.g., R&D expenditures - may positively contribute to value
as some investors tend to optimistically view them as investments. This cross-
industry calibration can consequently adversely impact the accuracy of comparable
selection. The reason is that it is eventually the warranted multiples that form the
basis for selecting comparables. These warranted multiples are generated based on
the regression coefficients that reflect the cross-industry general perspective on the
value drivers and disregard any industry-specific perspective.
As indicated above, Simak (2000) [133] is another study which also draws upon multiple
accounting criteria to choose comparables. Simak’s (2000) study - which appears to be
developed independently from Bhojraj and Lee's (2002) [25] research - addresses the problem of
choosing comparables using a DEA model (see, Chapter 4 for more details on DEA). Since
DEA is a non-parametric approach, it resolves the regression-related issues existing in the
methodology proposed by Bhojraj and Lee (2002) [25]. Furthermore, Simak’s (2000) [133]
methodology was originally developed in the context of private firm valuation, and hence,
seems to be more compatible with the IPO framework. It should also be noted that Simak’s
(2000) [133] approach utilizes financial factors in ‘absolute’ format, as opposed to multiples,
which eliminates the ratio-related disadvantages pointed out above (see shortcoming (vii)).
Simak’s (2000) [133] methodology is, however, a semi-objective approach and cannot be
completely automated. Moreover, there exist other methodology-specific limitations which
will be discussed below once the technique is introduced.
As pointed out above, the objective of Simak’s (2000) [133] study is to find the market value
of a private firm. Therefore, after choosing the most comparable publicly traded firm to the
private firm under consideration, the market value of the private firm is approximated by the
market value of the comparable public firm.
In order to quantify ‘similarity’, Simak (2000) [133] developed a similarity metric using
DEA. The analysis is summarized below. The reader is referred to Chapter 4 for more details
on DEA and the respective parameters.
Step 1. Using the input-oriented DEA model of CCR or BCC, the ‘efficient frontier’ is
identified for a set consisting of the private firm and candidate public firms. The input
matrix includes variables such as total assets and total liabilities. The outputs can be
shareholders’ equity, net sales, net income and net cash flow. Based on the outcomes
of the DEA model, the firms with the same ‘reference set’ are clustered together.
Step 2. Simak (2000) [133] suggests a metric, referred to as ‘difference indicator’, for
gauging the difference between the private firm and other firms in the cluster. The
metric is defined as shown in the following equation:
\[ \delta_{ij} = \sqrt{\sum_{k} \left( \lambda_{ik} - \lambda_{jk} \right)^{2}} \tag{3.1} \]
(Eq. 7.1 in [133])
where δ represents the difference indicator, the private firm and the corresponding public firm are designated by the subscripts i and j, respectively, and k indexes the efficient firms in the shared reference set, which implies that firms i and j belong to the same cluster.
Step 3. The difference indicator is indeed a measure which shows the similarity in the values of λ. In other words, it shows how closely the 'efficient targets' of the two firms are positioned on the efficient frontier. It is possible to find two firms located distant from each other in space, yet with efficient targets situated close to each other, leading to a small value of the difference indicator. This case indicates the necessity for another dimension to complete the comparison. To this end, the firms which have been identified to have similar λs, using the difference indicator equation, are then
compared in terms of the radial 'efficiency score' (θ). Each firm's efficiency score can be considered as a measure of the firm's distance from the frontier. The comparison of θs would, therefore, ascertain that the two firms are positioned close to each other in space.
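Steps 2 and 3 above can be sketched as follows. All λ and θ values here are hypothetical, and the two closeness thresholds are illustrative only — the original methodology handles this comparison manually rather than via fixed cut-offs.

```python
import math

def difference_indicator(lam_i, lam_j):
    """Euclidean distance between the lambda vectors of two firms that
    share the same reference set of efficient peers (Eq. 3.1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lam_i, lam_j)))

# Hypothetical lambda weights over three shared efficient peers,
# plus radial efficiency scores theta for the second-stage check.
lam_private = [0.20, 0.50, 0.30]
lam_public = [0.25, 0.45, 0.30]
theta_private, theta_public = 0.82, 0.79

delta = difference_indicator(lam_private, lam_public)
close_on_frontier = delta < 0.1                       # similar efficient targets
close_in_space = abs(theta_private - theta_public) < 0.05  # similar distance from frontier
print(round(delta, 4), close_on_frontier and close_in_space)  # → 0.0707 True
```

Only when both checks pass are the two firms deemed comparable, which mirrors the two-dimensional comparison described in Step 3.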
Anadol (2000) [7] put Simak's (2000) [133] proposed methodology to the test using the 1998
accounting data of 457 Canadian public companies from the manufacturing sector, as
classified under the SIC system. Different sets of analyses were conducted in her study. In one
set, 62 ‘inefficient’ companies from the SIC cluster of 3600s were assessed. The results
showed that in 49% of the cases, the market capitalization of the respective company falls
within the range yielded by the methodology. In 13% of the cases, the estimated market
capitalization was reported to be within 25% of the realized market capitalization. The
reader is referred to the original study for further statistics.
In Anadol’s (2000) [7] thesis, the input-oriented BCC model was chosen (over CCR) to
conduct the analysis. BCC allows for variable returns to scale which is essential when the
analysis incorporates firms of various sizes. Moreover, the input-oriented BCC model can
handle negative outputs. Since in Anadol’s (2000) [7] dataset, only outputs (i.e., net sales and
net income) included negative numbers, the input-oriented BCC model suited the framework
of her analysis.
Despite the advantages listed above, the following limitations are associated with the
algorithm suggested by Simak (2000) [133]:
(i) The methodology is not applicable to the cases where the private firm under
scrutiny is ‘efficient’ itself. Anadol (2000) [7] mitigated this problem by proposing
an approach which only determines a lower bound for the market value of the
efficient private firm. The lower bound is computed as the weighted sum of the
market capitalizations of the inefficient public firms whose sets of efficient peer(s)
include the private firm of interest. The weights are defined as the normalized λs of
the inefficient public firms associated with the efficient private firm.
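Anadol's lower-bound computation can be sketched with invented numbers: each inefficient public firm contributes its market capitalization, weighted by its (normalized) λ on the efficient private firm of interest. The market capitalizations and λ values below are hypothetical.

```python
# Hypothetical inefficient public firms whose reference sets include the
# efficient private firm of interest: (market capitalization, lambda on FOIe).
inefficient_peers = [(120.0, 0.6), (80.0, 0.3), (200.0, 0.1)]

# Lower bound = weighted sum of market caps, weights = normalized lambdas.
lam_total = sum(lam for _, lam in inefficient_peers)
lower_bound = sum(cap * lam / lam_total for cap, lam in inefficient_peers)
print(round(lower_bound, 2))  # → 116.0
```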
It should be noted that the complication associated with 'efficient' firms of interest remains unresolved. Anadol's (2000) [7] method only offers an approximate solution with no grounding in the theory. The reasons are specified below:
In Simak's (2000) [133] methodology, λ's and θ represent a new set of coordinates which determine the position of a DMU in the new space of λ's and θ. The parameters λ and θ for each DMU are determined based on the 'efficient' peers. Therefore, for a group of DMUs which do not share identical efficient peers, the θ and λ factors would not be comparable. This argument clarifies where Anadol's (2000) [7] approach conflicts with the theory of DEA: In determining the lower bound for an 'efficient' firm of interest (FOIe), Anadol's (2000) [7] methodology draws on any 'inefficient' firm whose set of efficient peers includes (but is not limited to) FOIe. In other words, these inefficient firms are not associated with an identical set of efficient peers and hence, their λ (or θ) values cannot be examined against each other.
(ii) The BCC model is a radial model, and the radial efficiency score it yields (i.e., θ) does not incorporate all sources of inefficiency (see Chapter 4). Thus, in the presence of 'slacks', the parameter θ would not provide a correct representation of the distance of the respective firm from the efficient frontier.
(iii) The study does not formulate the trade-off between θ and λ. It is not clear how much of the similarity in λs should be forgone to achieve similarity with respect to θs. This issue, which arises in Step 3 above, was handled manually and subjectively by Simak (2000) [133] and Anadol (2000) [7]. In order to obtain a fully objective measure of similarity, the compromise between θ and λ needs to be modelled.
The algorithm of Simak (2000) [133] (examined by Anadol (2000) [7]) should be extended and evolved into a more comprehensive and analytically rigorous framework which
not only addresses the aforementioned limitations but also provides enhanced functionality and more reliable performance. Section 6.2 details how this thesis develops a new methodology for selecting comparables using DEA.
3.4. Factors Impacting the IPO Price
Prior to delving into the discussion of choosing comparables, it is essential to understand
which factors impact IPO offer prices as well as post-IPO short-term market prices. If two
firms are similar with respect to the factors deemed to impact investors’ opinion about prices,
they may experience comparable market performances as well. Hence, the outcome of this discussion serves to underline the factors which form the basis of the comparison among different IPOs.
The fundamental challenge in pricing the shares of an IPO or a public company is that there
is no well-defined ‘true’ value for the firm. Even pricing a firm’s assets is not a trivial task.
Usually, every firm has some intangible assets like intellectual property, patents, copyrights,
trademarks, etc. which are not easy to price. The value of a firm (and hence its share price)
depends on the ‘perception’ of the investors about its future performance and profitability
derived based on the information available today.
The value drivers of an IPO firm can be classified in two broad categories, accounting
information and non-financial metrics (Draho, 2004 [48]). Accounting information is available
through financial statements. Non-financial metrics represent pieces of information which are
often hard to quantify or to be directly incorporated into the valuation models; yet, they impact
investors’ opinion about expected financial performance of issuers. The ‘Use of Proceed’
section of a prospectus, for example, provides insight into future business activities; whether
the issuer plans to use the IPO proceeds for acquisition or to invest in an R&D project or to
launch a new product, or to cover past debts would lead to different interpretations of the
firm’s growth prospects. The list can be extended to factors such as the share retention ratio by
pre-IPO shareholders, underwriter reputation, possible risk factors indicated in the
prospectuses, etc.
Although focused on public firms, Ohlson (1995) [103] was one of the first few scholars who
attempted to model market value of a share as a function of accounting metrics (Draho, 2004
[48]). Assuming ‘clean surplus accounting’, Ohlson (1995) [103] built an autoregressive model to
express the market value of equity. The research indicates that the value of an equity today is
derived based on the book value and residual income information provided in the most recent
financial statements. Any difference between the realized and estimated equity prices is
attributed to ‘non-accounting information’ [48] impacting future residual incomes and is
incorporated into the noise term of the model.
Since the onset of the internet industry in 1994 (Bartov et al., 2002 [19]), one possible classification of the literature on stock valuation/pricing is into internet and non-internet studies. High market value ratios (e.g., market-to-book), high revenue growth expectations, and negative income at the IPO time (particularly in late 1999) introduce new challenges to the conventional valuation literature and call for diligence.
Using 295 public internet companies from the period of January 1996 to June 2000, Jorion
and Talmor (2001) [81] studied the impact of financial and non-financial data on the market
value of internet companies. They employed different regression models to examine the
impact of explanatory variables of book value, earnings, and web traffic, on the dependent
variable of market value. All variables were assumed to be synchronous. The former two
independent variables represent accounting information and the third explanatory factor is
considered a non-accounting factor. These accounting and non-accounting explanatory factors
are decomposed into their underlying components for a more detailed scrutiny of the relevance
of the aforementioned independent factors to the market value of equity. The variable
earnings, as an example, is decomposed into gross profit; selling, general, and administrative expenses (SGA); and R&D expenditures. To examine the impact of growth rates on the
explanatory power of the regression models, the study augments the list of explanatory
variables by the addition of the following two variables: scaled growth in sales and scaled
growth in SGA over the most recent quarter.
The study identifies a statistically significant relationship between the market value of equity and financial explanatory factors such as gross profit, scaled growth in sales and scaled
growth in SGA over the most recent quarter. However, the non-financial factor (i.e., web
traffic) expressed a more significant impact. The web factor was decomposed into the
following components: total number of times the page was viewed and total number of hours
the page was viewed. With further scrutiny, however, Jorion and Talmor (2001) [81] detected a
declining temporal trend in the impact of the non-financial factors on the market value of
equity and a growing temporal trend in the value relevance of the financial factors.
Considering that the time period of this study coincides with the dot-com bubble period (from
its birth to its burst), such trends indicate that as the frenzy of new economies abates and the
industry matures, the financial factors resume the significance they hold in the conventional
valuation framework.
Hand (2000) [65] reported comparable outcomes to Jorion and Talmor (2001) [81] in his
study of price-relevance of accounting information. He examined the quarterly financial
information of 167 ‘net’ public firms in the 1997-1999 period. His findings indicate a
significant positive relationship between the market value of equity and the following factors:
(i) book equity, (ii) positive net income, (iii) revenue, (iv) selling and marketing expenses or
R&D expenditures given negative net income. Among these factors, the market value of
equity is linearly increasing in the former (i.e., book equity) and log-linearly increasing in the
rest. It should, however, be noted that revenue, which is a component of net income, was found to be weakly related to the market value of equity. In short, it can be concluded that
regardless of the type of the relationship, accounting variables which are indicative of the
operational and structural characteristics of a firm remain the key value-drivers of stocks.
A subsequent more detailed study conducted by Hand (2001) [66] on R&D expenditures of
young biotechnology firms showed a significant positive market reaction to R&D expenditures
in this industry. The younger the firm is and the faster it grows, the more pronounced this
positive impact is. The evidence from the biotechnology industry is promising as it indicates
that even the market valuation of young biotechnology firms, which have limited transparency
of income and sales, demonstrates significant dependency on financial metrics.
The discussion has so far focused on the key value drivers of stock prices in general which
also apply to the post-IPO market prices. The necessity of such discussion can be outlined as
follows: Any algorithm for identifying comparables would require a set of factors to act as the
basis of comparison. Considering that the comparables would eventually serve to estimate the
distributions of post-IPO market prices for the IPOI, it is essential to ensure that the
foundation of comparison bears significant relevance to post-IPO market prices. Only in this
case can the post-IPO market history of a comparable provide insight into the post-IPO market
status of the IPOI. It should, however, be noted that this study is focused on quantifying the
short-term risk of investing in IPOs. Thus, the time of concern remains close to the issue day.
It is, therefore, interesting to scrutinize the factors which may be influential on the offer price
of an IPO or its immediate post-IPO market prices such as the closing price of the first day. In
their seminal study focused on valuing IPOs using accounting data, Kim and Ritter (1999) [87]
addressed this avenue of the research.
One of the objectives of Kim and Ritter’s (1999) [87] study was to examine a series of
accounting ratios in order to identify those leading to more accurate predictions of IPO offer
price. They studied 190 domestic operating firms over the 1992-1993 period. Firms were
selected to have positive earnings per share (EPS) over the year immediately preceding the issue day and positive book value per share (BPS_pre-issue) prior to the issue day. Kim and Ritter (1999) [87] adopted the comparable multiples methodology to predict IPO offer prices. Using least squares linear regression, the study regressed the multiples of an IPO (e.g., P/E, M/B_post-issue)
on the same multiples of the respective comparables. The study reported moderate prediction
accuracy for the following multiples: price-to-earnings, price-to-sales, enterprise value-to-
sales, enterprise value-to-operating cash flow, and market-to-book. It is, however, noted that
replacing earnings with the forecast of earnings would increase the prediction accuracy.
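The comparable-multiples prediction described above can be sketched as follows. The data are synthetic and the fitted coefficients are illustrative, not estimates from Kim and Ritter's sample; the sketch only shows the mechanics of regressing an IPO's multiple on its comparable's multiple and converting the predicted multiple into a price.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic sample: each IPO's P/E paired with its comparable's P/E.
comp_pe = rng.uniform(8.0, 25.0, size=50)
ipo_pe = 1.1 * comp_pe + rng.normal(scale=2.0, size=50)

# Least squares linear regression of the IPO multiple on the
# comparable multiple (polyfit returns [slope, intercept]).
slope, intercept = np.polyfit(comp_pe, ipo_pe, 1)

# Predict a new IPO's offer price: predicted P/E times the IPO's own EPS.
new_comp_pe, new_ipo_eps = 15.0, 2.0
predicted_price = (intercept + slope * new_comp_pe) * new_ipo_eps
print(round(predicted_price, 2))
```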
Although Kim and Ritter’s (1999) [87] primary goal was to gauge the accuracy of the
comparable multiples technique in estimating offer prices, their results pose implications for
determining value drivers of equity market prices as well. Based on this study, it can be
concluded that earnings per share, sales per share, book value, and operating cash flow play a
material role in forming investors’ opinion about stock prices. It is worthy of note, however,
that due to the shortcomings associated with ratios, absolute accounting measures are preferred
over ratios (Certo et al., 2009 [33]). Loss of economies of scale and loss of information are
some of the common challenges related to ratios.
Based on his investigation of the pre-IPO underwriter reports of 228 French IPOs,
Roosenboom (2012) [121] provides the following list of the most commonly used multiples by
the underwriters in computing the estimates of ‘fair value’ of IPOs: price-to-earnings, price-to-
cash flow, price-to-sales, price-to-book, enterprise value-to-sales, enterprise value-to-earnings.
His list is in line with Kim and Ritter’s (1999) [87] study; Roosenboom (2012) [121], likewise,
highlights the potentially significant role of the following variables in determining value:
earnings per share, sales per share, book value, and operating cash flow.
In a different setting, Zhu (2000) [141] also recognized profit and revenue as the key factors
impacting the market performance of a company. The primary objective of his research was to
develop a methodology for gauging the financial performance of the Fortune 500 companies
by integrating the existing diverse measures, serving this purpose, into a single framework. He
accomplished this goal using DEA. His two-stage DEA model consists of two sub-models: (1)
profitability model, (2) stock marketability model. The inputs and outputs of each stage can be
taken to discern the key factors impacting a company’s profitability and market performance.
In the first stage, it is assumed that profit and revenue are primarily impacted by the following
three factors: (i) number of employees, (ii) assets, (iii) stockholders' equity. In the second
model, profit and revenue, in turn, serve as the key factors impacting the market performance
proxied by (i) market value, (ii) total return to investors, (iii) earnings per share.
As aimed at the onset of this section, a search was conducted to assemble key value indicators from different studies. These factors, which can serve as the comparison criteria in choosing comparables, can be summarized as follows:
(i) Revenue
(ii) Gross profit
(iii) Selling, general and administrative expenses
(iv) Net income
(v) Shareholders’ equity
(vi) Operating cash flow
Chapter 4. LITERATURE REVIEW ON METHODOLOGY:
DATA ENVELOPMENT ANALYSIS
4.1. Introduction
This chapter reviews the literature on DEA as pertinent to this research. Section 4.2
provides a brief summary on the basic DEA models, such as CCR (CRS), BCC (VRS), and
SBM. This section aims to build the necessary background for discussing more advanced DEA
models which can cope with negative data and ‘non-discretionary’ factors. These two latter
lines of research are respectively unfolded in Sections 4.3 and 4.4. Section 4.3 presents an
overview of existing methods and recent advances in the field of negative data in DEA. The
outcome of this survey is subsequently utilized in selecting a suitable approach for handling
the mix of negative and positive data in this thesis. Section 4.4 scrutinizes the current models
for non-discretionary factors. This review reveals the present gaps and forms the basis for
addressing the existing methodological barriers. Prior to delving into the details of various
DEA models, the following paragraphs provide a general overview of DEA.
Established by Charnes, Cooper, and Rhodes (Charnes et al., 1978 [36]), Data Envelopment
Analysis (DEA) is a nonparametric linear-fractional programming technique which can be
used to evaluate the relative performance of decision making units (DMUs). The performance
of each DMU is represented by a set of inputs and outputs of the respective DMU. The DMUs
are comparable which implies that the same types of inputs and outputs are used across the
DMUs (Cooper et al., 2007 [41]).
DEA determines a frontier of efficient DMUs that ‘envelopes’ the inefficient DMUs. The
frontier which is called the ‘efficient frontier’ is determined endogenously using the set of all
DMUs whose performances are to be evaluated. DEA identifies a ‘reference set’ located on
the efficient frontier as a benchmark for each DMU. The performance of each DMU is
evaluated with respect to its reference set. As the outcome of the evaluation, DEA assigns a
performance degree (i.e., ‘degree of efficiency’) to each DMU (Cooper et al., 2007 [41]).
It should be noted that the source and amount of the inefficiency of a DMU are determined
by projecting it on the efficient frontier. The direction of this projection is based on the model
selected and/or the objectives of the analysis. The projection can be input-oriented, output-
oriented or can combine both orientations. The objective of an input-oriented model is to
minimize the inputs without compromising the given level of outputs (i.e., outputs remain at
least at the given levels). An output-oriented model aims to maximize outputs without
exceeding the given level of inputs. Some DEA models such as the additive model or the SBM
model (Sections 4.2.3 and 4.2.4) combine both orientations and simultaneously minimize
inputs and maximize outputs.
The shape of the efficient frontier is determined by the 'returns to scale' and convexity
assumptions imposed on the 'production possibility set'. The production possibility set
incorporates all the feasible activities, i.e., pairs of semipositive inputs and outputs denoted
by (x, y). As summarized by Bogetoft and Otto (2011) [30], the production possibility set P is
defined to have
(i) constant returns to scale if (x, y) ∈ P and α ≥ 0 imply (αx, αy) ∈ P
(ii) decreasing returns to scale if (x, y) ∈ P and 0 ≤ α ≤ 1 imply (αx, αy) ∈ P
(iii) increasing returns to scale if (x, y) ∈ P and α ≥ 1 imply (αx, αy) ∈ P
(iv) variable returns to scale if (αx, αy) ∈ P is only guaranteed for α = 1 (i.e., no scaling assumption beyond convexity is imposed)
The most commonly used DEA models can be listed as follows and are discussed next.
(i) Charnes, Cooper, and Rhodes (CCR) model (Charnes et al., 1978 [36])
(ii) Banker, Charnes, and Cooper (BCC) model (Banker et al., 1984 [15])
(iii) Basic additive model (Cooper et al., 2007 [41])
(iv) Slacks-based measure of efficiency (SBM) model (Tone, 2001 [136])
To clarify the essence of DEA, the CCR model (Section 4.2.1) is first presented in the form
of a fractional program. It is then converted to a linear program whose dual problem is
presented in this document. The dual problem is referred to as the envelopment form. The
BCC, additive, and SBM models are shown in the envelopment form as well. It should be
noted that the following sections provide a brief overview of the DEA models which form the
basis for the subsequent discussions. The reader is referred to the original studies for further
details.
4.2. Basic DEA Models
4.2.1. CCR Model
Efficiency is often evaluated using Farrell’s measure of efficiency (Farrell, 1957 [51]):
θ = output / input   (4.1)

In the case of multiple inputs and multiple outputs, Equation (4.1) can be modified as follows:

θ_DMU = virtual output / virtual input = (weighted sum of outputs) / (weighted sum of inputs)   (4.2)
If the weights are set in advance, it would not be clear what fraction of the inefficiency of a
DMU arises from its actual inefficiency and what fraction of it reflects the error of setting
weights. The CCR model developed by Charnes et al., 1978 [36] addresses this problem by
setting variable weights derived endogenously from the data of DMUs; that is, for each DMU,
the weights are set as unknowns. Equation (4.3) is then maximized subject to the constraint
that, using the same weights, the efficiency scores of all the DMUs are bounded to values in
the range [0, 1] (see, Equations (4.4) to (4.6)). The same steps are repeated for each DMU. It
follows that, in addition to finding the weights endogenously, the procedure puts the
performance of each DMU in its best light by setting the corresponding weights such that
the efficiency score is maximized for that DMU.
To illustrate the above-indicated steps, assume there are n DMUs (j = 1, …, n) to be
evaluated, and let m and s denote the number of inputs and outputs, respectively. The inputs
and outputs of DMU_j are represented by the semipositive vectors (x_1j, …, x_mj) and
(y_1j, …, y_sj), respectively. The input and output weights are collected in the vectors v and u,
respectively. The radial efficiency of each DMU is determined from the input and output data
of all the DMUs by solving the optimization presented below:
(FP)
max_{v,u}  θ_o = (u_1 y_1o + u_2 y_2o + … + u_s y_so) / (v_1 x_1o + v_2 x_2o + … + v_m x_mo)   (4.3) (Eq. 2.3 of [41])
subject to  (u_1 y_1j + u_2 y_2j + … + u_s y_sj) / (v_1 x_1j + v_2 x_2j + … + v_m x_mj) ≤ 1  for j = 1, …, n   (4.4) (Eq. 2.4 of [41])
v_1, v_2, …, v_m ≥ 0   (4.5) (Eq. 2.5 of [41])
u_1, u_2, …, u_s ≥ 0   (4.6) (Eq. 2.6 of [41])
The efficiency scores of all the DMUs are computed by running n such optimizations. In each
optimization, the DMU under scrutiny is denoted by DMU_o and its efficiency score is
represented by θ_o, where 0 < θ_o ≤ 1.
In a given optimization, any DMU whose efficiency score equals 1 under the calculated
weights belongs to the reference set of DMU_o. The efficiency score of DMU_o (θ_o) can be
equal to 1 as well. A DMU is radially efficient if θ_o equals 1 and the model finds at least one
positive pair of weight vectors (v* > 0, u* > 0).
To make the computations less intensive, Equations (4.3) to (4.6) can be converted to a
linear programming problem which is called the ‘multiplier form’ in the literature (see,
Equations 2.7 to 2.11 of Cooper et al., 2007 [41]). The dual problem of the multiplier form
(called the ‘envelopment’ form) is shown below. Note that the production possibility set (P) in
the CCR model is assumed to have the constant returns to scale property discussed in the
preceding section.
Dual problem (input-oriented)
min_{θ,λ}  θ   (4.7) (Eq. 3.6 of [41])
subject to  θx_o − Xλ ≥ 0   (4.8) (Eq. 3.7 of [41])
Yλ ≥ y_o   (4.9) (Eq. 3.8 of [41])
λ ≥ 0   (4.10) (Eq. 3.9 of [41])
Here, λ ∈ R^n is an n-dimensional semipositive vector. The semipositive input and output
vectors of the DMU under scrutiny (DMU_o) are denoted by x_o and y_o, respectively. The
optimal θ (θ*) is a real number in the (0, 1] range. The matrices X and Y multiplying λ
incorporate the data of all the DMUs and can be represented as follows:

X = [ x_11  x_12  …  x_1n
      x_21  x_22  …  x_2n
      ⋮     ⋮     ⋱  ⋮
      x_m1  x_m2  …  x_mn ]   (4.11) (Eq. 2.1 of [41])

Y = [ y_11  y_12  …  y_1n
      y_21  y_22  …  y_2n
      ⋮     ⋮     ⋱  ⋮
      y_s1  y_s2  …  y_sn ]   (4.12) (Eq. 2.2 of [41])
The above-indicated dual problem minimizes the inputs while keeping the outputs at least
at the given levels. The reference set for DMU_o comprises the DMUs with non-zero
λs. Equation (4.8) implies that the projection of DMU_o on the efficient frontier can be found
by reducing all the inputs radially from the origin with the scale of θ without sacrificing any
output.
The dual problem just considered shows that reducing all the inputs in the same
proportion may put DMU_o on a terminating segment of the envelope. The terminating
segments are the extensions of the envelope parallel to the axes in the (m + s)-dimensional
space. A DMU located on a terminating segment is indicative of a remaining input excess or
output shortfall in that dimension.
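Since the envelopment form of Equations (4.7) to (4.10) is an ordinary linear program, it can be solved with any LP solver. The following minimal sketch (not part of the original methodology) assumes SciPy's `linprog` is available and uses a small hypothetical data set of three DMUs with one input and one output:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_oriented(X, Y, o):
    """Solve the input-oriented CCR envelopment form, Equations (4.7)-(4.10).
    X is the m x n input matrix, Y the s x n output matrix, o the index of
    the DMU under scrutiny.  Returns (theta*, lambda*)."""
    m, n = X.shape
    s = Y.shape[0]
    # decision vector z = [theta, lambda_1, ..., lambda_n]
    c = np.zeros(1 + n)
    c[0] = 1.0                                   # minimize theta
    A_ub_in = np.hstack([-X[:, [o]], X])         # X lam - theta*x_o <= 0  (4.8)
    A_ub_out = np.hstack([np.zeros((s, 1)), -Y])  # -Y lam <= -y_o          (4.9)
    res = linprog(c,
                  A_ub=np.vstack([A_ub_in, A_ub_out]),
                  b_ub=np.concatenate([np.zeros(m), -Y[:, o]]),
                  bounds=[(0, None)] * (1 + n),  # theta, lambda >= 0
                  method="highs")
    return res.x[0], res.x[1:]

# Hypothetical data: DMUs A, B, C, each with one input and one output
X = np.array([[2.0, 4.0, 3.0]])
Y = np.array([[2.0, 2.0, 3.0]])
theta, lam = ccr_input_oriented(X, Y, o=1)   # evaluate DMU B
```

DMU B obtains θ* = 0.5: its input must be halved to reach the CRS frontier spanned by A and C, whose output-to-input ratios equal one.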
To account for the 'input excesses' (s⁻ ∈ R^m) and 'output shortfalls' (s⁺ ∈ R^s), the
following extra linear program (Phase 2) is solved using the outcome of the dual problem
presented by Equations (4.7) to (4.10) (i.e., Phase 1):

max_{λ,s⁺,s⁻}  ω = e s⁺ + e s⁻   (4.13) (Eq. 3.11 of [41])
subject to  s⁻ = θ*x_o − Xλ   (4.14) (Eq. 3.12 of [41])
s⁺ = Yλ − y_o   (4.15) (Eq. 3.13 of [41])
λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0

where e is a vector of ones whose dimension matches its multiplier.
A DMU achieves Pareto-Koopmans or 'strong' efficiency if the optimal solutions of
Phase 1 and Phase 2 satisfy:
(i) θ* = 1, which is referred to as radial efficiency, technical efficiency, or weak efficiency
(ii) s⁺* = 0 and s⁻* = 0
If θ* equals 1 but some slacks are non-zero, all the inputs have been reduced radially
from the origin in the same proportion, yet some inputs still show excesses; this is
referred to as 'mix inefficiency'. Accounting for these excesses changes the proportions of the
inputs.
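The Phase-2 program of Equations (4.13) to (4.15) is likewise a plain LP. In the hypothetical two-DMU sketch below (SciPy assumed), DMU B is radially efficient (θ* = 1) yet carries one unit of excess in its second input, i.e., mix inefficiency:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_phase2(X, Y, o, theta_star):
    """Phase-2 LP, Equations (4.13)-(4.15): maximize the total slacks
    given the radial score theta_star obtained from Phase 1."""
    m, n = X.shape
    s = Y.shape[0]
    # decision vector z = [lambda (n), s_minus (m), s_plus (s)]
    c = np.concatenate([np.zeros(n), -np.ones(m + s)])   # maximize slack sum
    A_eq = np.vstack([
        np.hstack([X, np.eye(m), np.zeros((m, s))]),     # X lam + s- = theta* x_o
        np.hstack([Y, np.zeros((s, m)), -np.eye(s)]),    # Y lam - s+ = y_o
    ])
    b_eq = np.concatenate([theta_star * X[:, o], Y[:, o]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + m + s), method="highs")
    return res.x[:n], res.x[n:n + m], res.x[n + m:]

# Hypothetical data: DMU B uses the same amount of input 1 as A but more of
# input 2; Phase 1 yields theta* = 1 for B, so the slack is purely mix inefficiency
X = np.array([[1.0, 1.0],
              [1.0, 2.0]])
Y = np.array([[1.0, 1.0]])
lam, s_minus, s_plus = ccr_phase2(X, Y, o=1, theta_star=1.0)
```

The solver identifies s⁻* = (0, 1) and s⁺* = 0: B can drop one unit of its second input while matching A's output, even though no radial contraction is possible.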
Based on the preceding discussions, to improve the performance of the DMU under
scrutiny (DMU_o) using an input-oriented model, the inputs are first minimized radially and
then the input and output slacks are accounted for. Therefore, the projection of DMU_o on the
frontier is represented by (x̂_o, ŷ_o), where

x̂_o = θ*x_o − s⁻* = Xλ*   (4.16)
ŷ_o = y_o + s⁺* = Yλ*   (4.17)
Equations (4.16) and (4.17) imply that both the input and output coordinates of the
improved DMU can be expressed as the weighted sum of the respective inputs and outputs of
the DMUs in the reference set. Therefore, 𝜆𝜆 indeed represents the weights assigned to the
DMUs in the reference set.
As indicated, the CCR model can also be output-oriented, aiming to maximize outputs
without increasing the input levels. This model can be derived in a similar fashion to the
input-oriented CCR model described above. More details can be found in Charnes et al. (1978)
[36] or Cooper et al. (2007) [41].
4.2.2. BCC Model
As indicated, the CCR model assumes constant returns to scale. This assumption may not
hold in practice due to limitations in the production process. For instance, if a factory owns
limited land and facilities, increasing inputs cannot be translated into increasing outputs
indefinitely. To achieve a more realistic production frontier, Banker, Charnes, and Cooper
(Banker et al., 1984 [15]) propose a model whose production frontier is a piecewise linear
convex hull of the given DMUs and exhibits variable returns to scale. The only difference
between the BCC and CCR models is that the
BCC model adds the following equation as an extra constraint to Phase 1 of the dual problem
in the CCR model:

Σ_{j=1}^{n} λ_j = 1,  λ_j ≥ 0 for all j's   (4.18)

Due to this added constraint, the production frontier includes more efficient DMUs. Therefore,
a CCR-efficient DMU is expected to be BCC-efficient as well.
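Because the BCC envelopment form differs from the CCR form only by the convexity constraint eλ = 1, it can be sketched with the same LP machinery. The two-DMU data set below is hypothetical; under VRS the evaluated DMU obtains a score of 0.8, whereas the CCR model would assign it 0.4:

```python
import numpy as np
from scipy.optimize import linprog

def bcc_input_oriented(X, Y, o):
    """Input-oriented BCC envelopment form: the CCR dual, Equations
    (4.7)-(4.10), plus the convexity constraint e lambda = 1."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(1 + n)
    c[0] = 1.0                                            # minimize theta
    A_ub = np.vstack([np.hstack([-X[:, [o]], X]),         # X lam <= theta*x_o
                      np.hstack([np.zeros((s, 1)), -Y])])  # Y lam >= y_o
    b_ub = np.concatenate([np.zeros(m), -Y[:, o]])
    A_eq = np.hstack([[[0.0]], np.ones((1, n))])          # sum(lambda) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (1 + n), method="highs")
    return res.x[0]

# Hypothetical data: DMU A = (2, 2), DMU D = (2.5, 1)
X = np.array([[2.0, 2.5]])
Y = np.array([[2.0, 1.0]])
theta_bcc = bcc_input_oriented(X, Y, o=1)   # evaluate DMU D under VRS
```

Under VRS, D is benchmarked against A, the leftmost frontier point, so θ* = 2/2.5 = 0.8; under CRS the best output-to-input ratio of 1 would yield θ* = 0.4, illustrating that the VRS frontier envelops the data more tightly.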
4.2.3. Additive Model
In using the CCR and BCC models, the analyst is required to select the orientation of the
model in advance. Additive models eliminate this requirement by combining the input and
output orientations. The envelopment form of the basic additive model is shown below:
max_{λ,s⁺,s⁻}  z = e s⁺ + e s⁻   (4.19) (Eq. 4.34 of [41])
subject to  s⁻ + Xλ = x_o   (4.20) (Eq. 4.35 of [41])
Yλ − s⁺ = y_o   (4.21) (Eq. 4.36 of [41])
eλ = 1   (4.22) (Eq. 4.37 of [41])
λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0   (4.23) (Eq. 4.38 of [41])
Based on Equation (4.22), the production possibility set is characterized by variable returns to
scale.
Under the additive model, a DMU is efficient if and only if s⁻* = 0 and s⁺* = 0. The
coordinates of the improved version of the inefficient DMU under scrutiny (DMU_o), denoted
by x̂_o and ŷ_o, can be expressed as follows:

x̂_o = x_o − s⁻*   (4.24) (Eq. 4.44 of [41])
ŷ_o = y_o + s⁺*   (4.25) (Eq. 4.45 of [41])
The additive model accounts for the input and output inefficiencies directly, and z in
Equation (4.19) is an additive measure which captures the combined input excesses and
output shortfalls. Unlike the efficiency scores of the CCR and BCC models, which only show
the radial efficiency, z has the advantage of incorporating all the sources of inefficiency. It is,
however, an absolute measure and, unlike a relative measure such as θ, does not reflect the
relative relationship between inputs and outputs.
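The additive model of Equations (4.19) to (4.23) can be sketched with the same LP machinery; SciPy is assumed and the two-DMU data set below is hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def additive_model(X, Y, o):
    """Envelopment form of the basic additive model, Equations (4.19)-(4.23):
    maximize the total slacks under the VRS convexity constraint e lambda = 1.
    Returns (z*, s_minus*, s_plus*)."""
    m, n = X.shape
    s = Y.shape[0]
    # decision vector z = [lambda (n), s_minus (m), s_plus (s)]
    c = np.concatenate([np.zeros(n), -np.ones(m + s)])      # maximize slack sum
    A_eq = np.vstack([
        np.hstack([X, np.eye(m), np.zeros((m, s))]),        # X lam + s- = x_o
        np.hstack([Y, np.zeros((s, m)), -np.eye(s)]),       # Y lam - s+ = y_o
        np.hstack([np.ones((1, n)), np.zeros((1, m + s))]),  # e lam = 1
    ])
    b_eq = np.concatenate([X[:, o], Y[:, o], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + m + s), method="highs")
    return -res.fun, res.x[n:n + m], res.x[n + m:]

# Hypothetical data: DMU A = (2, 2) dominates DMU D = (2.5, 1)
X = np.array([[2.0, 2.5]])
Y = np.array([[2.0, 1.0]])
z, s_minus, s_plus = additive_model(X, Y, o=1)
```

For D the model reports z* = 1.5 with s⁻* = 0.5 and s⁺* = 1: both the input excess and the output shortfall relative to A enter the single additive score, but the figure depends on the measurement units, as discussed next.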
4.2.4. Slacks-Based Measure of Efficiency
The additive model gauges efficiency as an absolute measure which is neither 'dimension free'
nor 'unit invariant', since changing the units of inputs and outputs impacts the outcomes. The
model shown in this section modifies the objective function (i.e., the efficiency measure) of the
additive model so that it is unit invariant. The model was first suggested by Pastor et al. (1999)
[111] under the name 'Enhanced Russell Graph Measure of Efficiency'. Shortly after, in an
independent study, Tone (2001) [136] developed the same model and referred to it as the
'slacks-based measure of efficiency' (SBM). Although the model is commonly designated by the
latter name, the Enhanced Russell Graph Measure of Efficiency and SBM are indeed equivalent.
The efficiency measure of the model (ρ) is defined as the ratio of the mean input mix
inefficiency to the mean output mix inefficiency. It is a real number within the range (0, 1]
and can be expressed as follows:
min_{λ,s⁻,s⁺}  ρ = [1 − (1/m) Σ_{i=1}^{m} s_i⁻ / x_io] / [1 + (1/s) Σ_{r=1}^{s} s_r⁺ / y_ro]
             = [(1/m) Σ_{i=1}^{m} (x_io − s_i⁻) / x_io] / [(1/s) Σ_{r=1}^{s} (y_ro + s_r⁺) / y_ro]   (4.26) (Eq. 4.48 of [41])
subject to  s⁻ + Xλ = x_o   (4.27)
Yλ − s⁺ = y_o   (4.28)
λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0   (4.29)
Under the SBM model, a DMU is efficient if and only if ρ* = 1, which implies zero slacks.
Based on Equations (4.27) to (4.29), the coordinates of the improved activity on the efficient
frontier remain the same as those of the additive model (i.e., Equations (4.24) and (4.25)).
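The program (4.26) is fractional, but Tone (2001) shows it can be linearized through a Charnes-Cooper-type change of variables. The sketch below follows that standard route for the CRS form of Equations (4.26) to (4.29); SciPy is assumed and the two-DMU data set is hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def sbm(X, Y, o):
    """SBM efficiency score (CRS), solved via the standard Charnes-Cooper
    linearization of the fractional program (4.26)-(4.29): scale all
    variables by t so that the denominator of rho becomes 1."""
    m, n = X.shape
    s = Y.shape[0]
    x_o, y_o = X[:, o], Y[:, o]
    # decision vector z = [t, Lambda (n), S_minus (m), S_plus (s)]
    c = np.concatenate([[1.0], np.zeros(n), -1.0 / (m * x_o), np.zeros(s)])
    A_eq = np.vstack([
        # normalization: t + (1/s) sum S_r+ / y_ro = 1
        np.concatenate([[1.0], np.zeros(n + m), 1.0 / (s * y_o)])[None, :],
        # X Lambda + S_minus = t x_o
        np.hstack([-x_o[:, None], X, np.eye(m), np.zeros((m, s))]),
        # Y Lambda - S_plus = t y_o
        np.hstack([-y_o[:, None], Y, np.zeros((s, m)), -np.eye(s)]),
    ])
    b_eq = np.concatenate([[1.0], np.zeros(m + s)])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (1 + n + m + s), method="highs")
    return res.fun   # the optimal objective equals rho*

# Hypothetical data: DMU B = (4, 2) is dominated by DMU A = (2, 2)
X = np.array([[2.0, 4.0]])
Y = np.array([[2.0, 2.0]])
rho = sbm(X, Y, o=1)
```

B obtains ρ* = 0.5: every feasible mix of input excess and output shortfall relative to the CRS frontier yields the same slack-based score here, which the solver confirms.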
4.3. Negative Inputs or Outputs
The discussion of the DEA models presented so far assumes semipositive data. However, as
is the case in this thesis, the vectors of inputs or outputs may consist of negative data or they
may comprise a mix of negative and positive numbers. As a case in point, consider profit; it is
often treated as an output since more of the variable is desired. Profit can become negative for
the respective period, and there is, therefore, a need for a method which can handle such cases.
The theoretical discussion concerning negative data in DEA can be considered to have
commenced in the mid-1990s with the work of Lovell and Pastor (1995) [95] and Pastor
(1996) [108]. Since then, a broad range of research has emerged, proposing various
techniques to tackle this aspect of DEA. The reader is encouraged to consult Pastor and Ruiz
(2007) [110] and Pastor and Aparicio (2015) [109] for comprehensive surveys on the
advancement of this crucial line of DEA research. This section introduces and reviews the
seminal existing methodologies in light of the specific objectives pursued by this thesis.
Section 6.2 later presents how these methods are tailored to suit the characteristics of the
problem addressed in this document.
4.3.1. Translating Data
One solution to deal with negative inputs and/or outputs that may be encountered in
practice is ‘translating’ them to positive values. In DEA, shifting the inputs and/or outputs by
changing the origin (i.e., translating) results in a new problem. A DEA model is ‘translation
invariant’ if the optimal solution of the new problem remains the same as that of the original
problem (Cooper et al., 2007 [41]). Such model is also called ‘solution invariant’ (Seiford and
Zhu, 2002 [130]).
Among the DEA models introduced so far, the input (output)-oriented BCC model is
translation invariant if only the outputs (inputs) are shifted. Another translation-invariant
model is the basic additive model, which carries the convexity constraint eλ = 1 (Ali and
Seiford, 1990 [5]). The constraint eλ = 1 is required since only in its presence do the
constraints of the additive model remain unaffected by a shift of the origin. The additive
model is translation invariant with respect to both inputs and outputs. However, it should be
recalled that the efficiency measure yielded by this model is an absolute and unit-sensitive
number. Lacking a unit-invariant efficiency measure with the capacity to show the relative
relationship between inputs and outputs makes this model incompatible with the objectives of
many studies.
Lovell and Pastor (1995) [95] proposed a model which overcomes the unit-dependency
problem: the normalized weighted additive model. It was the first DEA model to integrate
both translation and unit invariance. The constraints of the normalized weighted additive
model are identical to those of the conventional additive model introduced in Section 4.2.3;
the objective function, however, maximizes the weighted sum of slacks. For each input or
output slack, the weight is set to the inverse of the sample standard deviation of the
respective input or output. It can immediately be perceived that the objective function is unit
invariant. It is translation invariant as well, since the standard deviation of a sample is not
affected by translating the data.
Although the normalized weighted additive model of Lovell and Pastor (1995) [95] is both
translation and unit invariant, it suffers from the following shortcoming: it is hard to make
sense of the efficiency measure generated by the model. The weighted sum of slacks, with
weights assigned as indicated above, does not clarify the relationship between the
outputs and inputs; nor does it provide a direct implication for improving the performance of
an inefficient DMU or determining its distance from the efficient frontier.
A variety of approaches have been used in the literature to transform data in order to make
them compatible with the existing standard DEA models. As an example, in order to eliminate
the problem of negative data, Lovell (1995) [94] mapped his output indicators onto the scale of
0 to 100. It should be noted that one cannot arbitrarily change the origin of the numbers since
the final outcomes would be sensitive to the amount of the transformation. This point is
illustrated below using the standard input-oriented BCC model, introduced in Section 4.2.2.
As indicated before, this model is not translation invariant with respect to inputs. Suppose
the inputs and outputs are translated by adding the positive constants τ and ξ, respectively,
chosen sufficiently large to turn all negative numbers into positive ones. If the untranslated
input and output matrices are denoted by X^OLD and Y^OLD, and their translated counterparts
by X^NEW and Y^NEW, the latter can be expressed in terms of the former as shown below:

X^NEW = X^OLD + τ   (4.30)
Y^NEW = Y^OLD + ξ   (4.31)

Replacing X^OLD and Y^OLD in the BCC model with the transformed data (i.e., X^NEW and
Y^NEW from Equations (4.30) and (4.31)) leads to the following program:
min_{θ,λ}  θ   (4.32)
subject to  θx_o^NEW − τθ − X^NEW λ + τeλ ≥ 0   (4.33)
Y^NEW λ − ξeλ ≥ y_o^NEW − ξ   (4.34)
eλ = 1   (4.35)
λ ≥ 0   (4.36)
Applying the convexity constraint (i.e., eλ = 1), the model evolves into the following form:

min_{θ,λ}  θ   (4.37)
subject to  θx_o^NEW − X^NEW λ + τ(1 − θ) ≥ 0   (4.38)
Y^NEW λ ≥ y_o^NEW   (4.39)
eλ = 1   (4.40)
λ ≥ 0   (4.41)
This model is not solution invariant since, unlike the rest of the equations, Equation (4.38)
has not preserved its original form (i.e., Equation (4.8)); the data transformation introduces
the extra term τ(1 − θ) into this equation, which changes the optimization results. In other
words, the λ and θ values do not remain intact when the origin is changed (i.e., the data are
translated). Furthermore, Equation (4.38) shows that the final outcomes are sensitive to the
size of the transformation (i.e., τ in this case).
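This sensitivity can also be verified numerically. In the hypothetical two-DMU sketch below (SciPy assumed), shifting only the inputs by τ = 10 changes the evaluated DMU's input-oriented BCC score from 0.8 to 0.96, confirming that the model is not translation invariant with respect to inputs:

```python
import numpy as np
from scipy.optimize import linprog

def bcc_theta(X, Y, o):
    """Input-oriented BCC score (envelopment form plus e lambda = 1);
    re-implemented here to keep the demonstration self-contained."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(1 + n)
    c[0] = 1.0
    A_ub = np.vstack([np.hstack([-X[:, [o]], X]),          # X lam <= theta*x_o
                      np.hstack([np.zeros((s, 1)), -Y])])  # Y lam >= y_o
    b_ub = np.concatenate([np.zeros(m), -Y[:, o]])
    A_eq = np.hstack([[[0.0]], np.ones((1, n))])           # sum(lambda) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (1 + n), method="highs")
    return res.x[0]

# Hypothetical data: DMU A = (2, 2), DMU D = (2.5, 1)
X = np.array([[2.0, 2.5]])
Y = np.array([[2.0, 1.0]])
theta_original = bcc_theta(X, Y, o=1)        # score on the original data
theta_shifted = bcc_theta(X + 10.0, Y, o=1)  # inputs translated by tau = 10
```

The two calls return 0.8 and 0.96: the radial contraction is measured from the (shifted) origin, so the score depends on the size of τ exactly as Equation (4.38) predicts.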
4.3.2. Treating Negative Outputs (Inputs) as Positive Inputs (Outputs)
Another possible, and perhaps the simplest, approach is to treat negative outputs as positive
inputs, and vice versa. This approach emerged as a solution for dealing with cases of
undesirable outputs or desirable inputs (e.g., Koopmans, 1951 [88]; Scheel, 2001 [126]; Seiford
and Zhu, 2002 [130]). In such cases, the efficient units are expected to achieve lower levels of
undesirable outputs, such as the amount of carbon dioxide produced by a process, or higher
levels of desirable inputs, such as the amount of toxic chemicals fed into a filter for treatment.
Such settings contrast with the conventional DEA framework where the true production
process demands the outputs to be maximized and the inputs to be minimized. One approach
to address this problem is to design the DEA model such that it presents the undesirable
outputs as inputs. Similarly, the desirable inputs can be treated as outputs. The production
possibility set of this DEA model would be identical to that of the case which keeps the
negative equivalents of the undesirable outputs on the output side or similarly, continues to use
the desirable inputs as inputs but with negative signs (Scheel, 2001 [126]).
Drawing upon the above-described approach, one may treat negative outputs as positive
inputs, and vice versa. If an input or output vector consists of a mix of positive and negative
numbers, this approach requires dividing the vector into two sub-vectors. One sub-vector
holds the positive numbers and fills the place of the negative values with zero (or a very
small positive value). Following the same format, the other keeps the absolute values of the
negative elements and substitutes zero (or a very small positive value) in place of the positive
numbers. The context then dictates which sub-vector needs to be maximized (minimized),
and hence should reside on the output (input) side.
This approach may be considered adequate if the entire vector of an input or output is
consistently negative. However, in cases where the vector comprises a mix of negative and
positive numbers, the approach is not as reliable. It suffers from significant theoretical
shortcomings which are detailed in Section 6.2.3.1.
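The splitting step described above can be sketched in a few lines; the profit vector below is hypothetical:

```python
import numpy as np

def split_mixed_vector(v, eps=0.0):
    """Split a mixed-sign input/output vector into the two sub-vectors used
    by the N-to-P approach: one keeps the positive entries, the other keeps
    the absolute values of the negative entries; every remaining slot is
    filled with eps (zero, or a very small positive value)."""
    v = np.asarray(v, dtype=float)
    positive_part = np.where(v > 0, v, eps)
    negative_part = np.where(v < 0, -v, eps)
    return positive_part, negative_part

# Hypothetical profit figures for four DMUs (a mixed-sign output)
profit = np.array([5.0, -3.0, 0.0, 2.0])
pos, neg = split_mixed_vector(profit)
```

Here `pos` = (5, 0, 0, 2) would stay on the output side while `neg` = (0, 3, 0, 0) would move to the input side, since losses are to be minimized.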
4.3.3. Semi-Oriented Radial Measure
The semi-oriented radial measure (SORM) model suggested by Emrouznejad et al.
(2010) [49] is closely related to the approach discussed in the previous section, which will
henceforth be referred to as 'N-to-P' for the sake of brevity. In an identical manner to the
N-to-P approach, SORM splits a vector of inputs or outputs containing a mix of positive and
negative numbers into two sub-vectors. The SORM model, however, handles the constraints
somewhat differently. The difference can best be explained using the mathematical
demonstration of the model. Below, the input-oriented SORM model is presented, which
appears as 'Model 8' in the work of Emrouznejad et al. (2010) [49]:
Input-Oriented SORM Model (VRS)
min_{θ,λ}  θ   (4.42) (Model 8 of [49])
subject to  X^+ λ ≤ θx_o^+   (4.43)
X^−1 λ ≤ θx_o^−1   (4.44)
X^−2 λ ≥ θx_o^−2   (4.45)
Y^+ λ ≥ y_o^+   (4.46)
Y^−1 λ ≥ y_o^−1   (4.47)
Y^−2 λ ≤ y_o^−2   (4.48)
eλ = 1   (4.49)
λ ≥ 0   (4.50)
Note that the notation of the original model is modified to ensure that it agrees with the
notation of the previously introduced DEA models. The reader is referred to the original paper
for the output-oriented SORM model and further details.
Above, X^− denotes an l × n matrix of inputs, where n stands for the number of DMUs and
l represents the number of inputs which contain a mix of positive and negative values.
Likewise, X^+ is an m × n matrix, identical in structure to the matrix X shown in Equation
(4.11), which only includes semipositive inputs, with m representing the number of positive
inputs. The vector x_o^− (x_o^+) of size l × 1 (m × 1) comprises the inputs of DMU_o
selected from X^− (X^+).
The matrices Y^− and Y^+, with dimensions k × n and s × n, respectively, are the output
counterparts of X^− and X^+. Similarly, the vectors y_o^− and y_o^+, of lengths k and s, are
the counterparts of x_o^− and x_o^+ shown above.
The matrix X^−1 is constructed by keeping all the positive values in X^− and substituting
the negative numbers with zeros. The matrix X^−2, on the other hand, is formed by replacing
all the negative values in X^− with their absolute values and setting the remaining entries to
zero. The vectors x_o^−1 and x_o^−2 represent the columns corresponding to DMU_o in the
matrices X^−1 and X^−2, respectively. The matrices Y^−1 and Y^−2 and the vectors y_o^−1
and y_o^−2 are interpreted likewise for the outputs.
The rest of the notation remains the same as introduced under 'Basic DEA Models' at the
onset of this chapter (see, Section 4.2).
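The construction of X^−1 and X^−2 from a mixed-sign block X^− can be sketched as follows (the numbers are hypothetical):

```python
import numpy as np

# A hypothetical mixed-sign input block X^- (l = 2 inputs, n = 3 DMUs)
X_neg = np.array([[3.0, -2.0, 1.0],
                  [-4.0, 5.0, -1.0]])

# X^-1 keeps the positive entries of X^- and zeros out the negative ones;
# X^-2 keeps the absolute values of the negative entries and zeros the rest
X_neg1 = np.where(X_neg > 0, X_neg, 0.0)
X_neg2 = np.where(X_neg < 0, -X_neg, 0.0)
```

Each column of `X_neg1` and `X_neg2` then supplies the vectors x_o^−1 and x_o^−2 appearing in constraints (4.44) and (4.45).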
It is readily observable that this model resembles the N-to-P approach, which allocates
negative and positive numbers to individual sub-vectors (see, Section 4.3.2). There exists,
however, one important difference. In the N-to-P approach, once the sub-vectors are formed,
the problem is transformed into a new one where the vector of mixed negative and positive
values is omitted and replaced with two new sub-vectors. The problem essentially gains new
inputs and outputs which are treated as ordinary inputs and outputs with no reference to their
origins. It is important to note that when the N-to-P approach is used, from a mathematical
point of view, a DEA model does not distinguish between the new sub-vector created on the
input side and the all-positive input vectors which originally resided on the input side. The
same point holds for the outputs. In the SORM model, on the other hand, the sub-
vectors are not simply treated as ordinary semi-positive inputs or outputs. If this were the case,
Equation (4.45) would have the following format:

X^−2 λ ≥ x_o^−2   (4.51)

as opposed to

X^−2 λ ≥ θx_o^−2   (4.45)

Equation (4.45) implies that the SORM model chooses to radially contract the absolute value
of a negative input and then searches for a target which is superior to this figure. The same
argument applies to the negative outputs in the output-oriented SORM.
Cheng et al. (2013) [38] have observed cases where the SORM model sets worse targets for
the inefficient units. The point just illustrated may be one cause of this issue. Equation
(4.45) indicates that in the case of negative inputs, where the model aims to maximize the
absolute values, the model may identify a target whose absolute value is greater than the
contracted value of the corresponding input of DMU_o; yet, it is possible that this target does
not exceed the uncontracted value of the input. In other words, while Equation (4.45) holds,
Equation (4.51) may not always be true. When Equation (4.51) is not satisfied, the absolute
value of the target's respective input is less than the absolute value of the corresponding input
used by DMU_o, which implies that the target indeed consumes more input.
Considering the close resemblance of the SORM model to the N-to-P approach, it inherits
all the shortcomings imposed by disaggregating the original input or output vector. These
limitations are discussed in detail in Section 6.2.3.1.
4.3.4. Variant of Radial Measure
Put forth by Cheng et al. (2013) [38], the variant of the radial measure (VRM) extends
the mathematics of the conventional radial DEA models (i.e., CCR and BCC) so that they can
handle negative data as well. The core of the model is to radially contract (expand) the
absolute values of the inputs (outputs) in order to measure the absolute value of the respective
possible improvement (i.e., input decrease or output increase). This prevents the model
from identifying worse targets, as may be the case with the methodology developed by
Emrouznejad et al. (2010) [49] (see, Section 4.3.3). It should be noted that the model proposed
by Cheng et al. (2013) [38] bears a close resemblance to the 'generalised proportional distance
function' suggested by Kerstens and Van de Woestyne (2011) [84]. The two studies, which
appear to have been conducted independently and almost concurrently, arrived at the same
solution to the following problem: how can the traditional radial DEA models be modified to
accommodate negative values?
In order to clarify the specific contribution of the model, this thesis presents the input-
oriented version which satisfies the VRS condition; the reader is referred to the original paper
for the output-oriented model. Cheng et al. (2013) [38] substitute the efficiency score of the
BCC model (i.e., θ) with (1 − β), where β can be considered a measure of inefficiency. This
modification restructures the BCC model into the following form (Equation 7 in Cheng et al.,
2013 [38]):
max_{β,λ}  β   (4.52) (Eq. 7 of [38])
subject to  βx_o + Xλ ≤ x_o   (4.53)
Yλ ≥ y_o   (4.54)
eλ = 1   (4.55)
λ ≥ 0   (4.56)
As indicated, in the input-oriented model of Cheng et al. (2013) [38], the absolute values of the
inputs are radially reduced. Therefore, Equation (4.53) shown above is replaced with the
following equation:

β|x_o| + Xλ ≤ x_o

In its complete form, the model can be written as below:
Input-Oriented VRM Model (VRS)
max_{β,λ}  β   (4.57) (Eq. 9 of [38])
subject to  β|x_o| + Xλ ≤ x_o   (4.58)
Yλ ≥ y_o   (4.59)
eλ = 1   (4.60)
λ ≥ 0   (4.61)
In the case of positive inputs, this model is identical to the traditional BCC model. For
negative inputs, Equation (4.58) can be written as follows:
−βx_o + Xλ ≤ x_o  →  Xλ ≤ (1 + β)x_o   (4.62)

Bearing in mind that Equation (4.62) concerns negative numbers, it is readily observable that
if the target's respective input (i.e., Xλ) is smaller than a greater-than-one multiple of
DMU_o's corresponding negative input (i.e., (1 + β)x_o, which satisfies (1 + β)x_o ≤ x_o for
x_o < 0 and β ≥ 0), it cannot exceed x_o either.
With a similar argument, it can be proven that the output-oriented model always identifies
better-performing targets and it can, therefore, be concluded that the VRM model would
always lead to efficient targets with superior performance.
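Since β and λ enter linearly, the input-oriented VRM model above is an ordinary linear program once both are stacked into one decision vector. The following sketch, assuming SciPy is available (function name and data layout are illustrative), solves Equations (4.57) to (4.61) directly; for purely positive inputs it reproduces the BCC score, as noted above.

```python
import numpy as np
from scipy.optimize import linprog

def vrm_input_oriented(X, Y, o):
    """Input-oriented VRM (VRS) efficiency score for DMU `o`.

    X: (m, J) inputs (may contain negatives), Y: (s, J) outputs.
    Maximises beta subject to beta*|x_o| + X @ lam <= x_o,
    Y @ lam >= y_o, sum(lam) = 1, lam >= 0, beta >= 0.
    Returns the efficiency score 1 - beta*.
    """
    m, J = X.shape
    s = Y.shape[0]
    c = np.zeros(1 + J)
    c[0] = -1.0                       # maximise beta -> minimise -beta
    A_ub = np.zeros((m + s, 1 + J))
    b_ub = np.zeros(m + s)
    A_ub[:m, 0] = np.abs(X[:, o])     # beta * |x_o| ...
    A_ub[:m, 1:] = X                  # ... + X @ lam <= x_o
    b_ub[:m] = X[:, o]
    A_ub[m:, 1:] = -Y                 # -Y @ lam <= -y_o
    b_ub[m:] = -Y[:, o]
    A_eq = np.zeros((1, 1 + J))
    A_eq[0, 1:] = 1.0                 # convexity: sum(lam) = 1 (VRS)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (1 + J), method="highs")
    return 1.0 - res.x[0]
```

With one positive input and a common output, the sketch agrees with the BCC radial score, as the text states for the positive-input case.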
Although the VRM model is capable of providing correct directions for improving DMUo’s
performance, it suffers from a significant shortcoming: the model does not comply with the
‘monotonicity’ rule; that is, the efficiency score of a DMU is expected to monotonically
decrease with its distance from the corresponding target on the efficient frontier. Under the
VRM approach, it is possible for a DMU requiring a greater improvement to reach its
efficient target to be identified as more efficient than a DMU located closer to its own
target. It should be noted that this issue only arises where input or output vectors contain
mixed signs; if input or output vectors are purely positive or purely negative, the
'monotonicity' rule holds.
4.3.5. Range Directional Models
The range directional models and the SORM (Emrouznejad et al., 2010) [49] and VRM
(Cheng et al., 2013) [38] models, introduced above, can be considered two different lines of
methodologies which deal with negative data. The former (i.e., range directional models)
precede the latter (i.e., SORM and VRM models) in the timeline of the approaches developed
to handle negative data in DEA, yet they do not suffer from the shortcomings associated with
the SORM and VRM models (see, Sections 4.3.3 and 4.3.4). In the cases of SORM and VRM,
researchers indeed aimed to pioneer a different line of methodologies which are capable of
coping with negative data, and yet, do not require the change of origin (as is the case with
directional distance models). However, further research is required to overcome the limitations
of SORM and VRM and to make them applicable to a broader range of situations. The range
directional models, on the other hand, offer a more comprehensive framework to base the
analysis on and to expand upon.
Generic directional distance functions were first studied by economists in the following
research: Chung et al. (1997) [39] and Chambers et al. (1996, 1998) [34,35]. Motivated by
these directional distance functions, Silva Portela et al. (2004) [132] developed a range
directional model to address cases where some data may be negative within the DEA context.
Their range directional model, referred to as RDM, is unit-invariant, and it can
straightforwardly be proven that if the convexity assumption holds, it is translation-invariant,
as well. The model is presented below. Except where noted, the notation is consistent with the
basic DEA models, introduced in Section 4.2.
A Non-Oriented Range Directional Model

max_{β,λ}   β                                   (4.63)   Eq. 1 [132]

subject to  Xλ + β g_{x_o} ≤ x_o                (4.64)
            Yλ − β g_{y_o} ≥ y_o                (4.65)
            eλ = 1                              (4.66)
            λ ≥ 0                               (4.67)
Silva Portela et al. (2004) [132] refer to the vectors g_{x_o} and g_{y_o} as the 'ranges of possible
improvement' for DMUo. Each element of these vectors is defined as follows:

g_{x_io} = x_io − min_j { x_ij }                (4.68)

g_{y_ro} = max_j { y_rj } − y_ro                (4.69)

where   i = 1, …, m ;   r = 1, …, s ;   j = 1, …, J

Consistent with the basic DEA models (see, Section 4.2), m and s denote the counts of
inputs and outputs, respectively, and J represents the number of DMUs participating in the
analysis.
In non-mathematical terms, the range for each specific input (output) of DMUo is defined as
the absolute value of the difference between that specific input (output) and the minimum
(maximum) value observed across all the DMUs for the respective input (output). The range
indeed shows the maximum improvement that DMUo can achieve. If for any input (output),
DMUo exhibits the minimum (maximum) value across all the DMUs, the respective range
would be zero for DMUo.
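Computing the ranges of Equations (4.68) and (4.69) is a one-line operation per vector. The sketch below (names are illustrative) assumes inputs and outputs are stored as NumPy arrays with one column per DMU, so that negative values need no special handling.

```python
import numpy as np

def rdm_ranges(X, Y, o):
    """Ranges of possible improvement for DMU `o` (Eqs. (4.68)-(4.69)).

    X: (m, J) inputs, Y: (s, J) outputs; entries may be negative.
    g_x[i] = x_io - min_j x_ij  and  g_y[r] = max_j y_rj - y_ro.
    A zero entry means DMU `o` already attains the best observed value.
    """
    g_x = X[:, o] - X.min(axis=1)
    g_y = Y.max(axis=1) - Y[:, o]
    return g_x, g_y
```

For the one-input, one-output illustration in the text (minimum input -2, maximum output 10), the ranges of any DMU are simply its distances to that ideal point.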
As can be inferred from the equations shown above, β represents the fraction of the
maximum possible slack that is attained by DMUo at its optimum state. It can, therefore, be
considered an inefficiency score, and consequently, the RDM efficiency score is given by
1 − β. Note also the similarity between this model and the reconfigured BCC model
(Equations (4.52) to (4.56)) if g_{x_o} and g_{y_o} are respectively substituted with the
corresponding observed inputs and outputs of DMUo (i.e., x_o and y_o). It
should be noted that the non-oriented model presented above can be converted to an input-
oriented (output-oriented) model by setting the coefficient of β to zero in Equation (4.65)
(Equation (4.64)).
In short, the RDM model is a unit- and translation-invariant radial model with the
efficiency score in the range of 0 to 1. The essence of the RDM model is that it chooses the
intersection of the minimums of all the inputs and the maximums of all the outputs (called the
‘ideal point’) as the origin instead of the origin of ‘0’. To elaborate more, consider a case with
one input and one output where the minimum observed input and the maximum observed
output are respectively -2 and 10, across all the DMUs. The RDM model is designed such that
the point (-2,10), as opposed to (0,0), is considered as the reference of the efficiency
calculations.
Similar to other traditional radial DEA models like BCC and CCR, the radial efficiency
score yielded by RDM does not incorporate all sources of inefficiency. In other words, the
efficiency score of ‘1’ does not automatically translate into the Pareto-Koopmans or ‘strong’
efficiency (see, Section 4.2). This shortcoming sparks the idea of developing a purely slacks-
based measure of efficiency for RDM. The slacks-based model could be considered as the
equivalent of SBM (Tone, 2001 [136]) in the traditional DEA context. It would, therefore, offer
similar advantages over RDM as SBM does over BCC or CCR. Sharp et al. (2007) [131] have
addressed this gap by proposing the 'modified slacks-based measure' (MSBM).
The MSBM model differs from the SBM model in its objective function (the efficiency
score). In the former, slacks are scaled by the 'ranges of possible improvement' (i.e.,
g_{x_io} and g_{y_ro}), whereas in the latter, slacks are scaled by the realized value of the
respective input or output of DMUo. The two scores diverge in one more aspect: the
conventional SBM assigns equal weights to the slacks, whereas MSBM has the capacity to
incorporate user-defined weights into the efficiency score. The following equations clarify
the mathematical structure of MSBM:
Modified Slacks-Based Measure

min_{λ, s⁻, s⁺}  ρ = ( 1 − Σ_{i=1}^{m} ω_i s_i⁻ / g_{x_io} ) / ( 1 + Σ_{r=1}^{s} ϑ_r s_r⁺ / g_{y_ro} )   (4.70)   Eq. 6 [131]

subject to  s⁻ + Xλ = x_o                       (4.71)
            Yλ − s⁺ = y_o                       (4.72)
            eλ = 1                              (4.73)
            eω = 1                              (4.74)
            eϑ = 1                              (4.75)
            λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0,
            ω ≥ 0, ϑ ≥ 0                        (4.76)
Here, the variables represent the same factors as in the previous models. The coefficients ω and
ϑ denote the user-specified weights associated with the slacks. Note that the ranges g_{x_io}
and g_{y_ro} remain identical to the RDM ranges defined above (see, Equations (4.68) and (4.69)).
It is worthy of note that the efficiency score of RDM (i.e., 1 − β) is greater than or equal
to the efficiency score of MSBM (see, Eq. 8 in Sharp et al., 2007 [131]). In intuitive terms, since
the MSBM model accounts for all sources of inefficiency using slacks, it generally yields
lower efficiency scores.
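The MSBM objective (Equation (4.70)) is a ratio of slack terms scaled by the ranges. The sketch below only evaluates that objective for a given set of slacks, ranges and weights; it does not solve the full fractional program, which requires a Charnes-Cooper-type linearization. Skipping zero ranges is a common convention (the DMU already attains the frontier value for that factor), and all names are illustrative.

```python
import numpy as np

def msbm_score(s_minus, s_plus, g_x, g_y, w=None, v=None):
    """Evaluate the MSBM objective (Eq. (4.70)) for given slacks and ranges.

    Terms with a zero range are dropped; weights default to equal
    weights summing to one, matching constraints (4.74)-(4.75).
    """
    s_minus, s_plus = np.asarray(s_minus, float), np.asarray(s_plus, float)
    g_x, g_y = np.asarray(g_x, float), np.asarray(g_y, float)
    m, s = len(g_x), len(g_y)
    w = np.full(m, 1.0 / m) if w is None else np.asarray(w, float)
    v = np.full(s, 1.0 / s) if v is None else np.asarray(v, float)
    # guard the divisions so zero ranges contribute exactly zero
    num = 1.0 - np.sum(np.where(g_x > 0,
                                w * s_minus / np.where(g_x > 0, g_x, 1.0), 0.0))
    den = 1.0 + np.sum(np.where(g_y > 0,
                                v * s_plus / np.where(g_y > 0, g_y, 1.0), 0.0))
    return num / den
```

A DMU with no slacks scores exactly 1, mirroring the Pareto-Koopmans interpretation discussed above.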
As final remarks on the topic of the range directional models, it is worthwhile to point out
that similar to the conventional setting of DEA, where an additive model is considered a close
cousin of SBM, in the context of range (directional) models, the following additive model falls
close to MSBM in terms of their capacity to fulfil the unit-invariance and translation-
invariance properties: the Bounded Adjusted Measure of Efficiency, or BAM (Cooper et al., 2011
[40]). It is a weighted additive model, where the objective function represents a weighted sum
of slacks. For a given input or output, the weight is set to be the inverse of the product of the
respective range of possible improvement and the total number of inputs and outputs. In
comparing BAM and MSBM, it should be noted that the weighted sum of slacks in
BAM does not clarify the relationship between the outputs and inputs; nor does it provide a
direct implication for improving the performance of an inefficient DMU or determining its
distance from the efficiency frontier. However, these shortcomings can be considered
negligible in the scope of many projects, and BAM can serve as an alternative to MSBM.
In conclusion, different approaches have been suggested to cope with negative inputs and
outputs in DEA. Considering the merits and limitations of each technique, discussed in detail
above, the range directional models (i.e., RDM and MSBM) offer a more advanced framework
for handling negative data, particularly in cases where the vector of an input or an output
comprises a mix of positive and negative values. In brief, unlike the other DEA models
presented in this section, RDM and MSBM possess translation invariance while preserving the
unit-invariance property as well as the true production process. Furthermore, the efficiency
scores yielded by these models can readily be interpreted similarly to the efficiency scores of
the conventional DEA models, such as BCC, CCR and SBM. The reader is referred to the
details presented above for a comprehensive discussion of the virtues and flaws of the
existing methods.
4.4. Non-Discretionary Variables
In the DEA research, there may exist variables which do impact the performance of
different DMUs; yet, they are ‘exogenously fixed’ (Cooper et al., 2007 [41]) and cannot be
varied at the discretion of the analyst or the management team. To view the performance
analysis from all sides, such variables, referred to as non-discretionary (ND)
variables, should be taken into account. As a case in point, in terms of sales performance, a
supermarket with few competitors, located in a densely populated residential region, would be
at an advantage compared to one of its kind based in a sparsely populated residential area and
competing with more rivals. In such cases, performance should be assessed in light of
these exogenous variables, which cannot be modified or improved but do impact the conduct
and outcome of the operations.
The conventional DEA research does not directly associate the ND variables with the
efficiency score; that is, in a slacks-based model, as an example, the slacks of the ND variables
do not appear in the objective function (i.e., the efficiency score). In a radial model, as another
example, the problem does not pursue the radial improvement of the ND variables. This
restriction is justified by the viewpoint that it would not eventually be possible to enhance the
ND variables based on the suggestions provided by the efficiency scores. The DEA research,
however, accounts for the impact of the ND variables by allowing them to act as constraints in
the optimization problem. The ND variables, therefore, restrict the production possibility set,
and consequently, modify the efficiency score indirectly via their presence in the constraints.
To make it applicable to the cases with ND factors, Banker and Morey (1986) [16] modified
the traditional two-phase BCC as follows:
Non-Discretionary Input-Oriented BCC

min   θ − ε ( Σ_{i∈D} s_i⁻ + Σ_{r=1}^{s} s_r⁺ )              (4.77)   Model 7.1 [41] or Eq. 19 [16]

subject to  θ x_io = Σ_{j=1}^{n} x_ij λ_j + s_i⁻ ,   i ∈ D            (4.78)

            x_io = Σ_{j=1}^{n} x_ij λ_j + s_i⁻ ,   i ∈ ND             (4.79)

            y_ro = Σ_{j=1}^{n} y_rj λ_j − s_r⁺ ,   r = 1, …, s        (4.80)

            Σ_{j=1}^{n} λ_j = 1                                       (4.81)

            λ_j ≥ 0 ,   j = 1, …, n                                   (4.82)
            s_i⁻ ≥ 0 ,  i = 1, …, m
            s_r⁺ ≥ 0 ,  r = 1, …, s
The model presented above is input-oriented; the output-oriented model with ND outputs
is readily attainable following the same notion. The model preserves the previous notation in
this document. The only difference is that the vector format has been expanded into
summation terms in order to better demonstrate the modifications applied to account for the
ND inputs. Moreover, the coefficient ε in the objective function indicates that the slacks are
maximized in the second phase. To elaborate, the first phase minimizes θ, which is then
used as a constant in the second phase, where the slacks are maximized. Note that the objective
function does not encompass the ND slacks. In addition, the problem does not allow the ND
inputs to be radially reduced.
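Phase 1 of the Banker-Morey model is again a linear program in θ and λ. The following sketch, assuming SciPy and using illustrative names, radially contracts only the discretionary inputs while the ND inputs enter as fixed constraints; the slack-maximizing second phase is omitted for brevity.

```python
import numpy as np
from scipy.optimize import linprog

def nd_bcc_theta(X, Y, nd_inputs, o):
    """Phase 1 of the Banker-Morey input-oriented BCC model.

    Inputs listed in `nd_inputs` are exogenously fixed and enter as
    X_ND @ lam <= x_o; the remaining (discretionary) inputs are
    contracted radially by theta. Returns theta* for DMU `o`.
    """
    m, J = X.shape
    s = Y.shape[0]
    disc = [i for i in range(m) if i not in nd_inputs]
    c = np.zeros(1 + J)
    c[0] = 1.0                                 # minimise theta
    rows, rhs = [], []
    for i in disc:                             # X_D @ lam - theta*x_io <= 0
        rows.append(np.r_[-X[i, o], X[i, :]]); rhs.append(0.0)
    for i in nd_inputs:                        # X_ND @ lam <= x_io (no theta)
        rows.append(np.r_[0.0, X[i, :]]); rhs.append(X[i, o])
    for r in range(s):                         # -Y @ lam <= -y_ro
        rows.append(np.r_[0.0, -Y[r, :]]); rhs.append(-Y[r, o])
    A_eq = np.zeros((1, 1 + J))
    A_eq[0, 1:] = 1.0                          # convexity (VRS)
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * J, method="highs")
    return res.x[0]
```

Because θ never multiplies the ND rows, the sketch cannot "improve" an exogenously fixed input, matching the restriction described in the text.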
With minor changes, Banker and Morey’s (1986) [16] model can be extended such that it
handles ND inputs and outputs simultaneously. To this end, Equation (4.77) can be modified
to only incorporate the slacks of discretionary outputs and Equation (4.80) can be split into
two equations, addressing the discretionary and non-discretionary outputs separately but
preserving the same format. Such extensions have also been documented by Golany and Roll
(1993) [61]. The authors also modified Banker and Morey's (1986) [16] model so that it can
quantify 'partial discretion'. The binary classification of factors as either discretionary or
non-discretionary may not be applicable in all frameworks. Working hours are one example
provided by the authors: managers can exercise some control over the number of hours, yet
they are obligated to comply with the profession's standards and occupational guidelines. To
express this principle in mathematical terms, Golany and Roll (1993) [61] restructured
Equations (4.77) to (4.82) and introduced two new variables which account for the level of
discretion in inputs and outputs. The two variables vary over the range of 0 to 1, with 1 and 0
respectively assigned to fully discretionary and fully non-discretionary factors. Further details
in this regard are beyond the scope of this research and the reader is referred to the original
study.
If the ND slacks in Banker and Morey's (1986) [16] models are forced to be zero (i.e., the ND
constraints are 'tight'), the model is called a 'non-controllable' model; otherwise, it is
referred to as a 'non-discretionary' model. To put an ND slack of zero into perspective,
consider the objective function of the multiplier (i.e., dual) equivalent of Banker and Morey's
(1986) [16] model:
max_{v,u}   Σ_{r=1}^{s} u_r y_ro − Σ_{i∈ND} v_i x_io         (4.83)   Model 7.7 [41]
Based on the duality theorem of linear programming, s_i⁻ = 0 for i ∈ ND translates into v_i
not equal to zero and implies that the objective function would be reduced by the amount of
v_i x_io. In non-technical terms, a reduction in the ND input would lead to enhanced
performance. A case in point could be excessive rainfall causing a serious loss of crops; the
excessive rainfall is considered a non-discretionary input whose increase would be destructive
(Cooper et al., 2007 [41]). A non-discretionary slack can assume a non-zero positive value as
well, which implies that any excess in the respective ND input does not cause any conflict
with the performance.
Ruggiero (1996) [124] modified the models of Banker and Morey (1986) [16] in the following
manner: any DMU with more favourable non-discretionary inputs or outputs compared to
DMUo is excluded from the program of efficiency calculation for DMUo. This goal is achieved
by setting the λs of the corresponding DMUs to zero. The objective of Ruggiero's (1996)
[124] study - which was purely concerned with the calculation of technical efficiency - was to
build a model which sets the closest references for DMUo in terms of non-discretionary
factors. In other words, the study sought to identify, as efficient references, those DMUs that
operated under a similar or, perhaps, inferior environment. Ruggiero’s (1996) [124] approach is
supported by the viewpoint that it may not be feasible for DMUo to achieve the status of an
efficient target which operates under stronger exogenously fixed factors.
The efficiency estimation yielded by the methodology of Ruggiero (1996) [124], however,
deteriorates as the number of ND factors increases and the model tends to identify more
DMUs as efficient. Such sensitivity may be caused by the fact that, with an increasing number
of ND variables, more DMUs are eliminated from the production possibility set, which makes
DMUo more likely to be identified as efficient. Ruggiero (1998) [123] offered a three-
stage DEA model to alleviate this shortcoming. Prior to unfolding that model, it should be
noted that the viewpoint pursued by Ruggiero (1996) [124] can also be achieved by the model of
Banker and Morey (1986) [16], if the ND constraints are ‘tight’ as explained above.
The first stage in the model of Ruggiero (1998) [123] involves using BCC with discretionary
factors to compute the efficiency score of the DMUs. In the second stage, the obtained
efficiency scores are linearly regressed on the exogenously fixed factors. This stage indeed
allocates a cross-sectional weight to each ND factor specifying its contribution to the yielded
efficiency scores in the first stage. For each DMU, the second stage finally computes a single
consolidated non-discretionary figure which is a weighted sum of all the ND factors with
weights as yielded by the regression. Only DMUs whose weighted sum is inferior to that of
DMUo would be preserved in the optimization program of DMUo.
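The second stage of Ruggiero (1998) can be sketched as an ordinary least-squares regression of the first-stage scores on the ND factors, followed by a weighted sum. The code below is a minimal illustration under the assumption that a larger index indicates a more favourable environment; names and the peer-filtering direction are illustrative.

```python
import numpy as np

def nd_index(eff, Z):
    """Second stage of Ruggiero (1998): regress first-stage efficiency
    scores `eff` (length J) on the ND factors `Z` (J x k) and return a
    single weighted ND index per DMU, with the regression coefficients
    acting as the cross-sectional weights (intercept dropped).
    """
    A = np.c_[np.ones(len(eff)), Z]           # intercept + ND factors
    coef, *_ = np.linalg.lstsq(A, eff, rcond=None)
    return Z @ coef[1:]

def allowed_peers(z_index, o):
    """Keep only DMUs whose ND index is no more favourable than DMU o's
    (assuming larger index = more favourable environment)."""
    return [j for j in range(len(z_index)) if z_index[j] <= z_index[o]]
```

In the third stage, only the DMUs returned by `allowed_peers` would remain in the optimization program of DMUo, as described in the text.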
Further contributions coping with ND factors in the framework of radial DEA models can
be observed in the works of Yang and Paradi (2006) [140] and Muñiz et al. (2006) [100]. In line
with the goals pursued in Ruggiero (1996 and 1998) [124,123], the foremost objective of these
studies is to compute a measure of efficiency which is stripped, as far as possible, of the
impact of exogenously fixed factors, and hence reflects the true technical efficiency of the
DMU of interest. To achieve this goal, the methodologies depend on a series of assumptions. In the
model of Ruggiero (1998) [123], for example, regression is used to assign cross-sectional
weights to the ND factors. Therefore, it is assumed a priori that the ND factors are linearly
related to the efficiency score. Such assumptions expose the analysis to new sources of
uncertainty.
This thesis acknowledges the benefits of these studies when the context is solely concerned
with calculating technical efficiency and providing guidance on which aspects of the business
can be improved. However, the goal of these models does not align with the specific
objectives of this thesis, mainly for the following reason: each DMU can indeed be
represented by a point in the multi-dimensional space of discretionary and non-discretionary
factors, and in this thesis, DEA seeks to map these points into a new space defined by the
coordinates of λ and the efficiency score (see, Section 3.3 for more details). As will be detailed in
Chapter 6, the efficiency score and the λs can be utilized to gauge 'similarity' among a group of
firms. In this context, the efficiency score is considered one coordinate of the comparison,
representing the distance between the efficiency frontier and the firm of interest, in terms of
the discretionary variables. Furthermore, with respect to the ND factors, the firm of interest is
only compared to the similarly positioned firms. This outcome is ensured by the ND
constraints. The models discussed above cannot preserve the integrity of discretionary and
non-discretionary factors in the transformation from the old space to the new space and fail to
simultaneously account for the effect of the ND variables on the values of λ yielded by the
optimization program.
In addition to the radial models, the slacks-based models have also been extended to cope
with the cases with ND inputs and outputs. Below, the additive model with this characteristic
is presented:
Additive Model with ND Factors

max   e s⁺ + e s⁻                               (4.84)   Eq. 7.2 [41]

subject to  s⁻ + Xλ = x_o                       (4.85)
            Yλ − s⁺ = y_o                       (4.86)
            s⁻ ≤ π ⊙ x_o                        (4.87)
            s⁺ ≤ γ ⊙ y_o                        (4.88)
            λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0               (4.89)
This model was originally developed by Charnes et al. (1987) [37] and similar to the model of
Golany and Roll (1993) [61], it accommodates ‘partial discretion’. Except for what is explained
below, the rest of the notation agrees with the previous models in this chapter.
The right-hand sides of Equations (4.87) and (4.88) represent the element-wise
multiplication of the respective two vectors. Each element of the m × 1 vector π can vary over
the range of 0 to 1, with 0 representing a fully non-discretionary input and 1 denoting a fully
discretionary input. The elements of the s × 1 vector γ can adopt any value from the range
[0, ∞). The upper bound (i.e., γ_r → ∞) applies to the cases where the respective output is
discretionary and implies that the corresponding fully discretionary slack is only bounded to
be non-negative. The lower bound, on the other hand, applies to ND outputs. In
both cases of inputs and outputs, any number between the upper and lower bounds would
represent ‘partial discretion’ which would be determined based on the degree of discretion that
can be exercised with respect to the corresponding input or output. Note that the model assigns
‘0’ to the ND slacks which can be justified using the same rationale provided for the case of
non-controllable radial model discussed above.
Farzipoor Saen (2005) [52] extended the SBM model (see, Section 4.2.4) to accommodate
non-discretionary inputs and outputs. His model essentially appends the constraints of the
model of Charnes et al. (1987) [37] (i.e., Equations (4.85) to (4.89)) to the list of constraints of
the standard SBM model. The following equations exhibit the outcome:
SBM Model with ND Factors

min_{λ, s⁻, s⁺}  ρ = ( 1 − (1/m) Σ_{i=1}^{m} s_i⁻ / x_io ) / ( 1 + (1/s) Σ_{r=1}^{s} s_r⁺ / y_ro )   (4.90)   Eq. 4.48 [41]

subject to  s⁻ + Xλ = x_o                       (4.91)
            Yλ − s⁺ = y_o                       (4.92)
            s⁻ ≤ π ⊙ x_o                        (4.93)
            s⁺ ≤ γ ⊙ y_o                        (4.94)
            λ ≥ 0, s⁻ ≥ 0, s⁺ ≥ 0               (4.95)
As with the additive model of Charnes et al. (1987) [37], the ND slacks adopt the value of '0'
in this model; therefore, the objective function is not directly affected by the ND slacks.
However, note that m and s denote the total numbers of inputs and outputs, respectively. Thus,
the number of ND factors does impact the weights assigned to the discretionary slacks in the
objective function of Farzipoor Saen (2005) [52] and renders the efficiency score positively
biased. To correct for this bias, Equation (4.90) can be adjusted by replacing m and s with the
corresponding numbers of discretionary inputs and outputs.
Based on the scrutiny of the sequence of the DEA models for the ND variables, it can be
concluded that the production process is as affected by the ND factors as it is by the
discretionary factors (Ray, 1988 [116]). It is, therefore, cogent to incorporate the ND constraints
into the DEA framework in a similar manner to the discretionary variables. Any model
addressing the impact of ND factors should, however, be designed such that it meets the
individual goals of the problem. As an example, if a radial model is used to provide managers
with some guidance on which aspects of the respective business can be improved, it cannot be
designed to radially improve the ND factors. The reason is that the ND variables are
exogenously fixed and cannot often be controlled by the management team. In this research,
the ND constraints will primarily be used for classification purposes and to ensure that the
firm of interest is compared to the firms operating in a worse or alike environment. This thesis
guarantees this outcome by focusing on the slacks of the ND constraints. Section 6.2.3.2
details the contribution of this thesis to the literature of the ND factors in DEA.
Chapter 5. LITERATURE REVIEW ON METHODOLOGY:
BAYESIAN INFERENCE IN RISK ASSESSMENT
5.1. Introduction
In 1994, J.P. Morgan [62], the world-renowned investment bank, brought forth a
revolutionary, yet practical and conceptually simple, measure in risk analysis: Value at Risk
(VaR). It is a single statistical figure and an explicit quantity that best summarizes the total
risk borne by a party holding a given portfolio. Since its emergence, VaR has remained a
popular risk measure by virtue of its simplicity and the ease of interpretation of its outcome
which can effortlessly be communicated to managers and stakeholders. Furthermore, the
authenticity of the measure contributes to its perpetuity. Its reliability is reflected in its
recognition as a crucial factor for determining the ‘capital adequacy’ of a bank or any other
financial institution. Regulatory bodies rely on the VaR analysis to decide on the amount of
equity per risk-weighted assets that must be held by a bank or financial institution to ensure
sufficient funds for safe operations and cushioning the hypothetical future problems.
In essence, VaR is a ‘quantile’ of a pdf representing gain or loss of an investment over a
given time period (e.g., between the present and a ‘target day’). It is worth noting that the
underlying pdf is not restricted to be a gain-loss one; since VaR is a quantile, its definition can
straightforwardly be extended to include other types of pdf as well. In layman's terms, VaR
expresses the minimum loss, in absolute numbers, that an investment may suffer, with
probability p, over a given period of time (e.g., the next N days). Phrased otherwise, with
probability p, in the next N days, the loss amounts to a figure equal to, or greater than, a
threshold referred to as VaR. As can be perceived, VaR is always accompanied by a
probabilistic statement about how likely it is for the investor to incur a loss of $VaR, or
more, in the next N days. The time horizon (or target date) and the probability (or level of
confidence) are indeed the two key parameters that assemble to define VaR for a given portfolio.
The confidence level is often chosen to be 99% or 95% in practice (i.e., p = 1% or 5%,
respectively). The concept of VaR has been well covered in the literature, and the following
studies fall among many which center on this topic: Dowd (1999) [47], Khindanova and Rachev
(2000) [85], Khindanova et al. (2001) [86], and Jorion (2006) [80].
The simplicity of interpretation and intuitiveness of the VaR approach should not be
mistaken for a lack of sophistication. Underlying any VaR statement is a pdf whose estimation
poses the major challenge for the approach. The finance literature abounds with techniques
for calculating VaR, among which two approaches stand out as the most prevalently
applied: historical simulation and model building (Siu et al., 2004 [134]). As suggested by
the terminology, the former approach is data-driven; that is, it relies on historical data to build
the empirical density required for the VaR estimation. The conspicuous shortcoming of this
method is its requirement of a long history to generate reliable and accurate outcomes. The
latter approach is a parametric one which decides on the form of the gain-loss distribution a
priori. Given sufficiently simple initial assumptions, this method can lead to closed-form
formulae for VaR. A case in point would be the study conducted by Holton (1996) [68] which
rests on two assumptions: (i) gain-loss follows a normal distribution, (ii) the respective
portfolio is represented as a linear combination of its assets.
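Under these two assumptions, the VaR computation collapses to a quantile of a normal distribution. Below is a minimal sketch using only the Python standard library; the function name and the square-root-of-time scaling of the daily parameters are illustrative assumptions, not part of the original study.

```python
from statistics import NormalDist

def normal_var(value, mu_d, sigma_d, p=0.05, horizon_days=1):
    """Parametric (model-building) VaR under normal gain-loss and a
    linear portfolio: the loss threshold exceeded with probability p
    over the horizon. mu_d and sigma_d are the daily mean and
    volatility of the portfolio return; `value` is the portfolio value.
    """
    mu = mu_d * horizon_days
    sigma = sigma_d * (horizon_days ** 0.5)   # square-root-of-time rule
    q = NormalDist(mu, sigma).inv_cdf(p)      # p-quantile of the return
    return -q * value                         # reported as a positive loss
```

For a $1,000,000 portfolio with zero mean and 1% daily volatility, the one-day 95% VaR is roughly $16,449 and the 99% VaR roughly $23,263, reflecting the standard normal quantiles at p = 5% and 1%.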
Since the gain-loss density is determined a priori in the model building approach of the
VaR analysis, it puts forth the challenge of how these models are developed and configured to
provide insight regarding the future scenarios of risk. Risk analysis is essentially a subjective
framework since it is impacted by how an analyst views and interprets uncertainties of the
financial market and how prior human beliefs influence the choice of the ‘right’ model
(Holton, 1997 [69]). The Bayesian methodology offers a formal means of incorporating
subjectivity and prior knowledge in the risk analysis. Concisely, in the Bayesian framework,
‘probability’ is interpreted as the subjective degree of belief in a proposition, which can be
contrasted against the rival ‘frequentist’ interpretation, expressing the probability of an event
as its long-run frequency of occurrence (see, Section 5.2 for more details on the two schools of
thought).
As the rest of this chapter unfolds, the Bayesian framework provides risk analysts with the
flexibility of integrating their subjective probability judgments with data-based objective
inferences. The Bayesian algorithm ‘learns’ from new evidence and sequentially ‘updates’ the
prior probabilities as new data accumulate. The Bayesian philosophy is mostly praised for
incorporating an analyst’s degree of uncertainty about the model of interest and its parameters,
in addition to the uncertainties pertinent to the random fluctuations in the market – the only
type of uncertainty quantified in the classical setting. In general, the VaR risk analysis within
the Bayesian context involves the following steps:
(i) Identify the risk factors.
(ii) Lay out a model relating the portfolio value to the risk factors.
(iii) Design probabilistic models, such as probability densities, for the risk factors, and
refine (update) them in light of new information.
(iv) Compute the predictive density of the portfolio value by integrating (averaging)
over the values of the risk factors.
(v) Estimate the portfolio VaR based on the predictive density.
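As a deliberately simplified illustration of steps (i) to (v), consider a single risk factor (the daily return) with known volatility, a conjugate normal prior on its mean, and VaR read off the predictive density. All distributional choices and names here are illustrative assumptions.

```python
from statistics import NormalDist

def bayesian_var(value, returns, prior_mu, prior_sd, sigma, p=0.05):
    """Toy Bayesian VaR: one risk factor (daily return) with known
    volatility `sigma` and a normal prior on its mean.

    (iii) conjugate normal-normal update of the mean on `returns`;
    (iv) predictive density = normal with posterior mean and variance
         sigma^2 + posterior variance (parameter uncertainty included);
    (v)  VaR as the loss at the p-quantile of the predictive density.
    """
    n = len(returns)
    xbar = sum(returns) / n
    prec = 1.0 / prior_sd**2 + n / sigma**2        # posterior precision
    post_mu = (prior_mu / prior_sd**2 + n * xbar / sigma**2) / prec
    post_var = 1.0 / prec
    pred = NormalDist(post_mu, (sigma**2 + post_var) ** 0.5)
    return -pred.inv_cdf(p) * value
```

Because the predictive density carries the extra posterior variance of the mean, the resulting VaR slightly exceeds the plain normal VaR computed with the same volatility, which is exactly the additional layer of (parameter) uncertainty the Bayesian setting is praised for.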
These steps will become clearer as more insights are gleaned from the Bayesian
methodology. The remainder of this chapter acquaints the reader with practical nuts and bolts
as well as perils and promises of the Bayesian and frequentist perspectives. The Bayesian
framework hosts a vast array of topics which extend beyond the capacity of this thesis. The
choice of the topics thus follows the specific objectives of this thesis and aligns with the
application of the Bayesian framework in this document.
5.2. Bayesian versus Frequentist Statistics
In the history of statistics, two main schools of thought have emerged regarding the
interpretation of probability. Although the roots of the Bayesian approach can be traced back
to the eighteenth century, the rejuvenation of the technique began in the second half of the
twentieth century due to advances in computational power. The debate of Bayesian vs.
frequentist has since spiraled into a war of ideas, with the Bayesian approach recruiting more
and more adherents from among practitioners and academics. Some Bayesian stalwarts
advocate the approach as the only effective means for making probabilistic inferences and
projections and tackling ranges of risk dilemmas faced by decision makers (Malakoff, 1999
[96]; Jaynes and Bretthorst, 2003 [79]).
The frequentist or classical approach depends on the frequentist interpretation of
probability; that is, the probability of an event is expressed as the ‘limit’ of its frequency of
occurrence as the number of trials tends to infinity. As the definition suggests, the framing
context of repeated random trials encumbers the application of the approach to
‘infrequent’ events. The Bayesian approach, au contraire, adopts the subjective definition of
probability; that is, probability represents degree of belief which is a dynamic value subject to
continual changes in the data. The entity’s belief is ‘updated’ in view of the new information
that arrives.
Frequentists recognize probability as a concept applicable to repeatable experiments. Based
on the relative frequency interpretation, a statement like “the probability of observing ‘Head’
in tossing a fair (unbiased) coin is 0.5” implies that if the coin is flipped an unlimited (or a
sufficiently large) number of times, the frequency of ‘Head’ occurrences tends to 0.5. Such
elucidation is often a long way from how probability is perceived by practitioners and even the
experts in academia. The predominant interpretation is that upon tossing the coin, there is a
50% chance that the coin lands ‘Head’. Such perception is not accepted by frequentists since
they view probability as an intrinsic property of the respective event. Frequentists
determinedly refrain from making statements about a single trial (i.e., a single toss of the coin,
in this case). The framework does not serve questions like “what is the probability of ‘Head’ in
a given flip”. Nonetheless, it offers the following answer to an assertive seeker: “In a given
trial, the coin will either fall ‘Head’ (with the probability of one) or ‘Tail’ (with the probability
of one), and the probability of observing a ‘Head’ remains constant from one trial to another,
being equal to one-half”.
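The limiting-frequency definition can be made concrete with a short simulation; the fair-coin probability of 0.5 and the trial counts below are, of course, only illustrative.

```python
import random

random.seed(42)

def head_frequency(n_flips: int) -> float:
    """Relative frequency of 'Head' in n_flips tosses of a fair coin."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The frequentist probability is the limit of this ratio as trials grow.
for n in (10, 1_000, 100_000):
    print(n, round(head_frequency(n), 3))
```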
The answer supplied by the Bayesian framework to the aforementioned question is more
intuitive and more consistent with scientific reasoning and the widespread cognition of
probability. The Bayesian approach interprets the question of “what is the probability of
‘Head’ in a given flip” as “what is the degree of belief in the coin landing ‘Head’, in a given
trial”. Assuming that the frequency of ‘Head’ occurrences was one-half in a past series of
experiments, the Bayesian framework asserts that “based on the current state of knowledge,
there is a 50% certainty that the coin turns ‘Head’ in the next trial”. Consequently, it emerges
that, contrary to the frequentist persuasion, the probability of an event is not considered as its
intrinsic property, remaining intact through trials, but rather is viewed as a subjective
expression of uncertainty about future outcomes and can be revised in light of the new
information obtained from an extended experiment.
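This belief-revision view can be illustrated with the standard conjugate Beta–Binomial model for the coin's ‘Head’ probability; the prior and the observed counts below are hypothetical.

```python
from fractions import Fraction

def update_beta(alpha, beta, heads, tails):
    """Bayesian update of a Beta(alpha, beta) belief about P(Head)
    after observing `heads` and `tails` new tosses."""
    return alpha + heads, beta + tails

# Start from an even prior belief: Beta(1, 1), posterior mean 1/2.
alpha, beta = 1, 1

# An extended experiment arrives: 70 heads, 30 tails.
alpha, beta = update_beta(alpha, beta, 70, 30)

# The degree of belief in 'Head' for the next toss is the posterior mean.
belief = Fraction(alpha, alpha + beta)
print(belief)  # → 71/102
```

The prior mean of one-half is revised toward the observed frequency of 0.7, exactly the kind of continual update described above.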
The distinction between the two rival interpretations can further be elaborated upon using
the following example, adopted from Ferson (2004) [55]. Frequentists apprehend probability as
a construct that is only applicable to events occurring in repeatable random processes. The
frequentist approach is, therefore, not inclined to address questions like “does God exist” or,
equivalently, “what is the probability of God’s existence” or “is she guilty of first-degree
murder”. The introduction provided above illuminates the logic behind this limitation: one
cannot design an experiment where the relative frequency of the existence of God is examined.
Nor can one conduct an experiment, consisting of a set of random trials, where she is
‘sometimes’ [55] guilty and ‘sometimes not' [55]. The Bayesian framework, on the other hand,
can easily accommodate such cases since the approach indeed gauges degrees of belief; how
certain one can be that God exists or she is guilty.
Of course, the differences between the two frameworks go beyond philosophical
arguments, and into pragmatic issues, which will be delved into following an introduction to
the Bayesian approach. There exists a wealth of literature offering broad conceptual aid with the
fundamentals of the methodology (e.g., Ferson, 2004 [55]; Rachev et al., 2008 [115]; Damien et
al., 2013 [44]). The reader is encouraged to pursue them for a comprehensive review of the
approach. Furthermore, the author assumes familiarity with the fundamentals of classical
statistics on the part of the reader.
5.3. Basics of Bayesian Inference
Derived from Bayes’ theorem, the Bayesian approach is a formal representation of
the subjective view of uncertainty and is a principled way for describing the process of
scientific reasoning in conformity with the axioms of probability theory. It offers a profound
structure for inferring from data which is more consistent with the learning mechanism
adopted by the human brain, programmed to learn from experience. The brain has been hard-
wired to judge by drawing from the reservoir of interacting past experiences and updating
them in light of present ones. Such an approach forms the core of the Bayesian methodology
where a ‘prior’ knowledge is updated by learning from the current information; the updated
prior knowledge is called the ‘posterior’ knowledge.
Bayes’ rule, which plays the central role in the Bayesian framework, is shown below in its
most well-known form:

fΖ(ζ|ұ) = [fΖ(ζ) × fҰ(ұ|ζ)] / fҰ(ұ)    (5.1)
where ұ represents a draw from the random variable Ұ, and Ζ denotes the parameter(s) of the
probability density function fҰ(ұ). The principal idea is to derive inferences about the
parameter Ζ assuming that the characteristics of the sampled data are best described by fҰ(ұ).
Note that Ζ itself is treated as a random variable contingent on a series of unknown
parameters, referred to as ‘hyperparameters’ (Rachev et al., 2008 [115]). The prior knowledge
regarding Ζ is summarized into the probability density function fΖ(ζ). It is, therefore, called
the ‘prior’ distribution of Ζ. The density function fΖ(ζ|ұ) denotes the updated version of the
prior density, refined using the data (i.e., ұ₁, ұ₂, ұ₃, …, ұₙ), and is termed the
‘posterior’ distribution of Ζ.

The density function fҰ(ұ|ζ) indeed represents fҰ(ұ₁, ұ₂, ұ₃, …, ұₙ|ζ), where ұᵢ (i =
1, …, n) designates the ith realization of the random variable Ұ. The function
fҰ(ұ|ζ) can therefore be formulated as a function of ζ since, given all n samples, the only
remaining unknown is ζ. This new function is termed the ‘likelihood’ function, which
congregates all the information held by the data into a single function.
In Bayes’ theorem, the greatest computational intricacy lies in calculating the
denominator in Equation (5.1). It is indeed a normalization factor and can be computed based
on the law of total probability, presented below in its continuous form:

fҰ(ұ) = ∫ fҰ(ұ|ζ) × fΖ(ζ) dζ    (5.2)
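In simple one-parameter problems, the normalization in Equation (5.2) can be carried out numerically on a grid of parameter values; the triangular prior and binomial-type likelihood below are arbitrary choices made only for illustration.

```python
# Grid of candidate parameter values zeta (here, a success probability).
STEP = 1 / 200
grid = [i * STEP for i in range(1, 200)]

def prior(zeta):
    """A hypothetical triangular prior favouring mid-range values."""
    return 1 - abs(zeta - 0.5)

def likelihood(zeta, heads=7, tails=3):
    """Binomial-type likelihood of the observed data given zeta."""
    return zeta**heads * (1 - zeta)**tails

# Numerator of Bayes' rule on the grid; Eq. (5.2) becomes a Riemann sum.
unnormalized = [prior(z) * likelihood(z) for z in grid]
normalizer = sum(unnormalized) * STEP               # f_Y(y)
posterior = [u / normalizer for u in unnormalized]  # f_Z(zeta | y)

# The posterior is a proper density: it integrates to one over the grid.
print(round(sum(p * STEP for p in posterior), 6))  # → 1.0
```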
The preceding discussion clarified how Bayes’ rule integrates the prior information with
the accumulated evidence (i.e., data) to assess the overall uncertainty of an event. These two
sources of information are fused together into a posterior distribution and it is the relative
strength of each source that determines the final ‘location’ and ‘dispersion’ of the posterior
distribution. The strength of a distribution is often gauged by its narrowness (so to speak),
since a narrow distribution concentrates its mass on a shorter range of possible values, ruling
out a wider array of values. There is, therefore, a trade-off between the two sources of
knowledge and the posterior distribution inherits its characteristics mainly from the stronger
source (i.e., the most informative source).
As an extreme example, consider a case where the prior belief is represented in the form of
a Dirac delta function. It can mathematically be proven that regardless of the size of the new
evidence accrued, the posterior distribution would remain intact, resembling the shape of the
prior distribution. Such an outcome comes as no surprise since when an analyst expresses the
prior belief as a Dirac delta function, they explicitly communicate a great confidence in the
prior knowledge; hence, data cannot cast any additional light or doubt on such strong
preconception. Perhaps the most ardent debate across the pro-Bayesian and anti-Bayesian
statisticians revolves around prior distributions which are broadly classified as informative and
non-informative priors (O’Hagan, 2008 [102]).
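The Dirac-delta extreme described above is easy to verify in a discrete setting: a prior that places all of its mass on a single parameter value is returned unchanged by the update, no matter how extreme the likelihood. The grid of values and the likelihood below are arbitrary.

```python
def bayes_update(prior, likelihood):
    """Discrete Bayes update: `prior` maps parameter values to weights."""
    unnorm = {z: prior[z] * likelihood(z) for z in prior}
    total = sum(unnorm.values())
    return {z: w / total for z, w in unnorm.items()}

# A discrete stand-in for a Dirac-delta prior: all belief on zeta = 0.3.
delta_prior = {0.1: 0.0, 0.3: 1.0, 0.5: 0.0, 0.9: 0.0}

# However extreme the evidence (here, data strongly favouring large zeta)...
strong_evidence = lambda z: z**50

# ...the posterior still coincides exactly with the prior.
post = bayes_update(delta_prior, strong_evidence)
print(post[0.3])  # → 1.0
```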
A prior distribution is considered informative if it alters the form of the respective posterior
distribution; that is, in a case where an informative prior is utilized, the inference derived on
the basis of the posterior differs from the one deduced solely based on the data, due to the
influence exercised by the prior knowledge. When considerable prior evidence exists, the
Bayesian approach encourages the analysts to benefit from such prior resources in arriving at
effective solutions; however, in some circumstances, the analyst may not desire to perturb the
posterior distribution with the prior knowledge since (s)he is either skeptical of the prior
information itself or in doubt whether an appropriate prior density has been adopted to present
the prior knowledge (i.e., if the prior knowledge has been translated correctly into a relational
mathematical expression). Drawing upon Laplace’s ‘principle of insufficient reason’ (Ferson,
2004 [55]), such cases can be handled by non-informative priors also referred to as ‘vague’ or
‘diffuse’ priors (Rachev et al., 2008 [115]). They do not influence the corresponding posteriors
and, consequently, the inferences made about the parameter(s), and are genuine indications of
ambiguity and hesitancy about prior beliefs.
Some researchers have put great effort into determining the most suitable non-informative
distributions in various settings (e.g., Berger, 2006 [21]). Although a distribution reflecting an
absolute lack of sensitivity to the prior knowledge seems unachievable, several forms of priors
have been proven to exert the least effect on posteriors; uniform distributions are
predominantly adopted to account for prior ignorance. Alternatively, some choose
distributions like normal with considerably large variances (Ferson, 2004 [55]; O’Hagan, 2008
[102]; Rachev et al., 2008 [115]).
Worthy of note is that one does not expect the inferences to be sensitive to the form of the
prior distribution if a sufficient amount of data exists. In the presence of strong data, the posterior
is primarily determined by the data (i.e., the likelihood function). Each prior is updated in
sequence as a new instance of data arrives and given reasonable initial forms, all differing
priors would eventually tend toward a common posterior form; sequential updating gradually
improves individual priors and moves them from the primary assignment toward the true
posterior.
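This washing-out of the prior can be demonstrated with two deliberately opposed Beta priors updated sequentially on the same simulated data stream; all figures below are hypothetical.

```python
import random

random.seed(7)

# Two analysts with deliberately opposed Beta beliefs about a success rate.
priors = {"optimist": (8.0, 2.0), "pessimist": (2.0, 8.0)}

TRUE_RATE = 0.6
data = [random.random() < TRUE_RATE for _ in range(5_000)]

means = {}
for name, (a, b) in priors.items():
    # Sequential conjugate updating: every observation nudges the Beta belief.
    for success in data:
        a, b = (a + 1, b) if success else (a, b + 1)
    means[name] = a / (a + b)

# With 5,000 shared observations the two posterior means nearly coincide.
print({k: round(v, 3) for k, v in means.items()})
```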
5.4. Merits of the Bayesian Methodology
As specified earlier, one significant and distinctive aspect of the Bayesian approach is its
capacity to provide a formal and legitimate mechanism to hybridize the prior information with
the sample information upon availability of a substantial amount of prior evidence – a feature
which leads to more informative posteriors and enhances the subsequent inferences. This merit
(i.e., use of subjective information) has simultaneously become the focus of the most heated
contention between Bayesians and frequentists. The criticism evoked by the latter party,
however, rebounds on frequentists themselves as well, since it fails to recognize the role of an analyst’s state of
mind in choosing a specific model or test; frequentists indeed take into account subjective
views, but in an implicit manner, when they devise a model, choose inference rules, speculate
on the value of the parameter(s), etc. Such subjective foundations are considered only natural
by the adherents of the Bayesian approach and they argue that it is much easier to perceive and
scrutinize the impact of subjective beliefs on the final outcomes when the framework
explicitly accounts for them; whereas, such an effect is less clear-cut in the frequentist setting
(O’Hagan, 2008 [102]).
Moving further in the discussion of subjectivity, it is worth stating that consolidating
subjective assertions with the notion of probability does not suggest that the Bayesian
framework is an ensemble of arbitrary ideas like irrational opinions and stiff prejudices.
Subjectivity implies exploiting pieces of information accumulated in the past and Bayesians
dismiss any other interpretation of subjectivity emerging from illusory prejudices rather than
facts. Simply put, the Bayesian philosophy is as subjective (or objective) as science is and
aligns more consistently with how scientific theories evolve. This is not to discount the
struggle of science to remain objective but to call attention to how it is yielded. Objectivity is
achieved through developing theories and debating the validity of them. The theories are
refined ceaselessly as new evidence accrues and, finally, converge to an accepted framework
(Howson and Urbach, 1996 [71]; O’Hagan, 2008 [102]). O’Hagan (2008) [102] comments that
“objectivity in science is really a convergence of subjective opinion, and that agreement may
be only temporary”.
The Bayesian approach is also praised for its intuitive and easily interpretable inferences,
aside from the inherent notion of subjectivity. To clarify this claim, it is first
required to point out another important distinction between the Bayesian and frequentist
settings which arises from each method’s view of unknown parameters. Although the
Bayesian methodology is often applauded for its capacity to incorporate subjective knowledge,
it was discussed in Section 5.3 that this aspect may be suppressed in some situations by means
of non-informative priors. Even in the case of informative priors, if there are enough data, the
posterior would eventually turn insensitive to the prior. The most distinctive superiority of the
Bayesian approach is indeed its view of the unknown parameters.
The ultimate goal of many data analysis attempts remains deriving statistical inferences, or
‘learning’, about the unknown parameters – the domain where the fundamental deviation
between the two schools of thought occurs. The frequentist setting considers parameters as
fixed numbers where the concept of ‘uncertainty’ is not applicable. On the contrary, Bayesians
treat all the unknown parameters of a model as realizations of random variables. The main
interest of the latter, therefore, lies in eliciting the distributions associated with the respective
parameters. In a nutshell, the Bayesian approach is a fully probabilistic process of learning;
data are used to learn about the parameters and this knowledge is ‘updated’ continually in
view of new evidence.
The debate of considering parameters as random variables, rather than fixed numbers,
originates from the differences between two main sources/types of uncertainty: (i) ‘aleatory’
uncertainty (or ‘variability’), (ii) ‘epistemic’ uncertainty (or ‘incertitude’). The former refers to
the uncertainty associated with random variables, in the commonly acknowledged sense.
Sample data, which are generated from a truly repeatable underlying process, inevitably
embrace this kind of uncertainty. Since it arises from the inherent randomness of the system,
increasing the sample size would not reduce the magnitude of the aleatory uncertainty. The
epistemic uncertainty, on the other hand, originates from the modeler’s insufficient or
incomplete knowledge (paucity of knowledge) and can be lessened upon acquisition of more
data. Parameters belong in this latter category; to wit, the modeler remains uncertain regarding
the true value of a parameter and their uncertainty may be reduced and resolved as more
evidence accumulates.
The frequentist setting, which is underlain by the frequency interpretation of probability,
only measures aleatory uncertainties. Implicit in this property of the framework lies
frequentists’ view of the probability distribution of a random variable: it is a resolute construct
and frequentists do not recognize any uncertainty associated with its form; whereas, Bayesians
do not assign any definiteness to the location and dispersion of a probability distribution and
acknowledge that the shape of the distribution can be refined in view of new information. In
short, Bayesians amalgamate parameter uncertainty and data uncertainty (i.e., both types of
uncertainty) and view probability as a comprehensive concept which simultaneously addresses
all sources of uncertainty.
The aforementioned has laid the foundation to explain how the Bayesian setting facilitates
deriving intuitive conclusions, which is recognized as its most significant benefit. It is provided
by the capacity to make inferences about the uncertainty of a parameter using the respective
(posterior) density. Inferences are broadly classified as point estimates (such as mean, median,
or mode) and interval estimates (such as credible interval). A considerable uncertainty in a
posterior distribution would, however, render the point estimates uninformative. Thus, the
concept of ‘credible interval’ provides a more suitable context for putting the aptitude of the
Bayesian framework into perspective. Credible interval is often considered as the analogue of
‘confidence interval’ in classical statistics; yet, the two concepts differ critically in
philosophy, with the former identified as a more practical notion. A credible interval, which is
intelligibly obtained from the respective posterior distribution, represents a region whose
probability corresponds to a pre-specified value (e.g., 1 − Ƿ). In mathematical terms, the
interval [l, u] stands for the (1 − Ƿ)% credible interval of the parameter Ζ if

ℙ(l < Ζ < u|ұ) = ∫ₗᵘ fΖ(ζ|ұ) dζ = 1 − Ƿ    (5.3)
The credible interval thus provides a range of likely values for Ζ and, more importantly, a
forthright response to a question like “In which interval does the parameter reside with the
probability (1 − Ƿ)% (e.g., 95%)?”. It is tempting to rush to conclusions and proclaim that the
notion of confidence interval addresses the same question; that is, to interpret the statement
“The (1 − Ƿ)% confidence interval of Ζ is [l̂, û]” as “Ζ falls within the interval [l̂, û] with
the probability (1 − Ƿ)%”. However, this insight is incorrect since, in classical statistics,
parameters are considered as fixed numbers. Thus, one cannot legitimately make probability
statements about them. The parameter Ζ either belongs (with probability one) or does
not belong (with probability one) in the range [l̂, û]. The misconception has its roots in
failing to recognize that a confidence interval is a random interval; that is, the lower (l̂) and
upper (û) bounds are themselves realizations of the random variables L̂ and Û,
respectively. The essence of a (1 − Ƿ)% confidence interval and its precise meaning can be
delineated as follows:
If a sufficiently large number of sets of random samples are generated and for each set, the
confidence interval is computed following a given rule of inference, then we are confident that
(1 − Ƿ)% of these random intervals will include the ‘true’ value of the parameter. It is,
therefore, the rule that is under evaluation by means of repeated sampling. Such interpretation
is certainly less intuitive and hard to communicate to non-statisticians. It is likewise
cumbersome to derive meaningful conclusions in the risk framework, on the basis of the
preceding correct interpretation. Digging deeper, it becomes evident that confidence interval is
frequently mistaken for credible interval on the part of experts. It is indeed the credible
interval which can reasonably address what they seek to measure about the parameters (Huck,
2009 [73]).
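The repeated-sampling interpretation can be checked by simulation: the parameter stays fixed while the interval varies from sample to sample, and roughly (1 − Ƿ)% of the intervals cover it. The normal data model, known volatility, and sample sizes below are illustrative assumptions.

```python
import math
import random
import statistics

random.seed(3)

TRUE_MEAN, SIGMA, N = 5.0, 2.0, 40   # the parameter is FIXED throughout
Z95 = 1.96                           # normal critical value for 95%

def confidence_interval(sample):
    """95% CI for the mean with known SIGMA; the interval is the random object."""
    m = statistics.fmean(sample)
    half = Z95 * SIGMA / math.sqrt(len(sample))
    return m - half, m + half

trials, covered = 2_000, 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    lo, hi = confidence_interval(sample)
    covered += lo < TRUE_MEAN < hi   # the fixed parameter is either in or out

print(round(covered / trials, 3))    # close to 0.95
```

It is the interval-construction rule, not the parameter, that earns the 95% figure.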
It is readily observable that hypothesis testing can be conducted more straightforwardly
in light of the concept of credible interval. As a case in point, suppose that the null hypothesis
states that the value of a parameter is at least zero. In the Bayesian setting, the probability of
the statement can be calculated based on the respective posterior distribution and the null
hypothesis would be accepted if the obtained probability exceeds a predetermined threshold
(O’Hagan, 2008 [102]). Contrast such an intuitive perspective with the frequentist wisdom, which
treats parameters as fixed numbers, revoking the validity of any probability statement about
them. The framework, therefore, has to develop a random variable (in the aleatory sense) with
a known density that is linked to the null hypothesis. It is then this random construct, referred
to as ‘test statistic’, which underlies the decision of rejecting or accepting the null hypothesis.
Finding the right test statistic in many settings complicates the problem. Moreover,
frequentists’ hypothesis testing suffers from the same limitation imposed by the confidence
interval: Few really understand it and it does not suit the framework of risk analysis (see, Page
99). When it is stated that the null hypothesis (e.g., Ζ ≥ 0) is rejected at the significance level
of, for example, 5%, a majority would interpret it as ℙ(Ζ ≥ 0|ұ) ≤ 0.05, or equivalently, the
probability that the null hypothesis is ‘true’, given the data, is equal to, or less than, 5%.
Similar to the case of confidence interval, such a perception contradicts the relative frequency
notion of probability since the null hypothesis and parameters, for that matter, are not part of a
repeatable experiment, and hence, probability statements cannot be tied to them (refer to
Section 5.2 for further details). The correct, yet less intuitive and less transparent,
interpretation would illustrate that if the null hypothesis were ‘true’, the probability of
observing a test statistic whose absolute value is equal to, or greater than, the realized test
statistic could be no higher than the pre-set statistical significance (in this case, 5%).
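The contrast can be made tangible with a toy Bayesian counterpart of the test just described. Assuming, purely for illustration, that the posterior of the parameter Ζ given the data is Normal(1.0, 0.5), the posterior probability of the null Ζ ≥ 0 is a direct computation:

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a Normal(mu, sigma) evaluated at x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Hypothetical posterior for the parameter Z given the data: Normal(1.0, 0.5).
post_mu, post_sigma = 1.0, 0.5

# The Bayesian test of H0: Z >= 0 is a direct probability statement.
p_h0 = 1 - normal_cdf(0.0, post_mu, post_sigma)
print(round(p_h0, 3))  # → 0.977

# H0 is accepted if the posterior probability clears a pre-set threshold.
THRESHOLD = 0.95
print(p_h0 > THRESHOLD)  # → True
```

No test statistic or sampling distribution is needed; the posterior itself answers the question the analyst actually asked.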
Indirect and distorted responses to inference questions are just one of the shortcomings of the
frequentist school of thought. Other contradictory traits are also associated with it, stemming
from the frequency definition of probability which demands repeatability and is sensitive to
the sampling structure or design of the experiment. The reader is encouraged to consult Berger
et al. (1988) [22] and O’Hagan (2008) [102] in this regard.
5.5. Merits of the Bayesian Approach in Risk Assessment
The contribution of the Bayesian methodology to different fields of practical reasoning
remained burdened by its high level of computational complexity; the process of computing
posteriors, particularly the normalization factor, can be cumbersome and impractical without
computers except for simple problems such as those with ‘conjugate’ priors. Since the 1980s,
when powerful computing machines started to gain ground, the Bayesian approach has enjoyed
an abrupt upward shift in its applications. It is particularly of interest in both financial and
non-financial risk management. Some statisticians even claim that it is the only probability
paradigm that can properly handle risk assessments. The remainder of this section allocates
more effort to navigate the aforementioned promises of the Bayesian wisdom in the field of
risk analysis.
The context of risk measurement and decision making is often identified by data scarcity –
a domain where the Bayesian approach comes in handy. By virtue of the notion of prior
distribution, Bayesians can make inferences even when empirical data are absent.
Furthermore, sophisticated tasks, like decision making, demand expert knowledge as well as
judgment and intuition to be formulated into objective mathematical terms. In the field of risk
management, where often limited data exist, not only is such expert intervention desirable but
essential. By synthesizing prior beliefs and new evidence, the Bayesian posterior distribution
greatly serves such requisites.
The capacity of the Bayesian approach to measure the epistemic component of uncertainty,
in addition to its aleatory constituent, is another feature that notably contributes to its
reputation as a suitable framework to pursue risk analyses. The following example
demonstrates how the frequentist approach may remain inferior to the Bayesian method in a
setting where it is sought to predict the future return of a portfolio. The future return of a
given portfolio can be considered as a random variable drawn from an underlying population.
Regarding the return as a random variable implies that the solution recognizes the aleatory
uncertainty associated with it. The uncertainty about future data is, however, primarily of the
epistemic type (O’Hagan, 2008 [102]); the parameters of the predictive density, conjectured to
express the likeliness of possible future returns, are themselves uncertain figures and this
aspect is not accounted for by the frequentists (see Section 5.4 for more details).
From the practical point of view, any type of uncertainty would in principle blur the
definiteness of the future and its impact on the final outcome must be jointly scrutinized with
that of any other source. On this account, a risk analyst or a decision maker is, in essence,
seeking a predictive distribution which would hybridize all sources of uncertainty and act as a
basis for deriving informed decisions and generating risk scenarios. The Bayesian framework
is the only setting which caters to such appeal. Bayesian thinkers are consistently concerned
with shaping their beliefs in the form of probability distributions which would in sequence be
blended with other uncertainties, leading to direct probability statements on different scales.
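A toy simulation conveys the point: when parameter draws (epistemic uncertainty) are layered beneath the return draws (aleatory uncertainty), the resulting predictive sample is wider than a plug-in sample that fixes the parameters at point estimates. All distributions and figures below are hypothetical.

```python
import random
import statistics

random.seed(11)

# Aleatory layer: a daily return drawn as Normal(mu, sigma).
# Epistemic layer: mu and sigma are themselves uncertain (toy posteriors).
def draw_parameters():
    mu = random.gauss(0.0005, 0.005)        # belief about the mean return
    sigma = abs(random.gauss(0.02, 0.005))  # belief about the volatility
    return mu, sigma

# Plug-in sample: parameters frozen at their point estimates.
plug_in = [random.gauss(0.0005, 0.02) for _ in range(50_000)]

# Bayesian predictive sample: average over parameter draws as well.
predictive = []
for _ in range(50_000):
    mu, sigma = draw_parameters()
    predictive.append(random.gauss(mu, sigma))

# Folding in epistemic uncertainty widens the predictive density.
print(round(statistics.stdev(plug_in), 4))
print(round(statistics.stdev(predictive), 4))
```

Risk measures read off the wider predictive sample, such as a VaR quantile, account for both sources of uncertainty at once.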
Heretofore, the focal intent of this chapter has been detailing the prominent advantages of
the Bayesian approach. The following discussion highlights and reviews certain features that
accomplished the recent rapid adoption of the methodology by many risk analysts. The list
appears in the study of Ferson (2004) [55]. A more comprehensive list praising the Bayesian
methodology can be located in Berry (1996) [24].
First. As detailed in the previous section, one benefit of the Bayesian framework is its
conformity with the process of inductive reasoning implemented by the human brain.
Inferences derived based on the Bayesian perspective are intuitively reasonable sets of
outcomes which address the original concerns posed by the analysts in a more direct and
concise manner. As discussed in Section 5.2, questions like “How likely is she to have
committed the crime?” cannot be addressed in the frequentist setting, which draws on the
notion of repetitiveness and demands a repeatable random underlying process; one cannot
design an experiment or a sequence of trials where she may be found guilty in some and not
guilty in the rest, and then equate the frequency of being judged guilty with the probability
of her committing the crime. Such questions are, nonetheless, welcomed under the Bayesian
philosophy since the setting caters to the interior monologue of the human brain; that is, in
light of the subjective interpretation of probability, the methodology parses the
aforementioned question and reforms it as follows: “what is our degree of belief that she is
guilty?”
Besides the wider variety of problems that can be accommodated by the Bayesian approach,
its inferences (e.g., credible interval) comply more with the scientific reasoning, compared to
their so-called counterparts in the frequentist setting (in this case, confidence interval). Finally,
in further support of the intuitive nature of the methodology, one can refer to its ability to
absorb all the prior evidence in an explicit manner and revise and refine it as new evidence
arrives. The reader is encouraged to consult Sections 5.3 and 5.4 for more details on these
matters.
Second. The Bayesian philosophy better feeds into the process of decision making. Although
more intricate to develop, Bayesian inferences are formed in a setting where any uncertainty
can be expressed as a probability computed using the posterior density, which is the key tool
for conveying risk information. It is indeed the process of hypothesis testing which facilitates
decision making based on the posterior distributions. In the frequentist setting, decision
making is likewise conducted through hypothesis testing; however, as detailed earlier, the
philosophy of hypothesis testing in the classical setting is less intuitive and more susceptible to
logical errors. Furthermore, it is not admissible to associate probability statements with the
null and alternative hypotheses, although they are direct indicators of the analyst’s concerns
and best frame the examiner’s questions. Consequently, the analyst is only supplied with the
binary answer of ‘accept’ or ‘reject’ and faces a numeric value (i.e., the significance level)
which indeed specifies the reliability of the rule chosen for hypothesis testing, not the
conclusions laid out in the form of hypotheses (see Section 5.4).
Among the shortcomings that accompany hypothesis testing in the frequentist framework,
the following is perhaps the most overlooked one. Classical hypothesis testing incurs two types of error: (i) Type I error, the probability of rejecting the null hypothesis when it is true; (ii) Type II error, the probability of failing to reject the null hypothesis when it is false. Type I error is routinely considered in analyses and is perceived as an integral part of classical hypothesis testing, whereas Type II error comparatively poses less of a concern. In the context of risk, however, quantifying the latter error is just as important as quantifying the former. Doing so introduces extra complexity with few benefits: the computation of Type II error requires defining a test statistic (and its corresponding density) specific to the alternative hypothesis. The process demands choosing an exclusive statement for the alternative hypothesis and would greatly complicate the implementation of the test.
Since the Bayesian approach tackles the task of decision making on a more fundamental
level, through assigning probability statements to the hypotheses, it is capable of charting a
direct and cognitive path to reach a decision. In addition, rather than yielding the binary
answer of ‘accept’ or ‘reject’, it leads experts to find a range for their degree of belief in the
accuracy of the hypotheses. From the Bayesian point of view, the decision-making route adopted by frequentists is regarded as a detour; that is, classical statisticians first need to devise a random process built upon the original decision question. Probabilistic statements can only be made about this random process, and only in this indirect manner can experts reach inferences on the original query.
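To make this contrast concrete, the following minimal sketch (a hypothetical Beta–Binomial example, not from the thesis; all figures are illustrative) computes the posterior probability of a hypothesis directly, rather than returning a binary accept/reject answer:

```python
import numpy as np
from math import lgamma

# Illustrative: a Beta(1, 1) prior on a success probability theta and
# 7 successes observed in 20 Bernoulli trials give a Beta(8, 14) posterior.
# The hypothesis H: theta >= 0.5 then receives a direct posterior probability.

def beta_pdf(theta, a, b):
    """Density of the Beta(a, b) distribution, evaluated on a grid."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return np.exp(log_norm + (a - 1) * np.log(theta) + (b - 1) * np.log(1 - theta))

def trapezoid(yv, xv):
    """Plain trapezoidal rule, kept explicit for portability."""
    return float(np.sum((yv[1:] + yv[:-1]) * np.diff(xv)) / 2.0)

successes, trials = 7, 20
a, b = 1 + successes, 1 + (trials - successes)      # conjugate Beta update
theta = np.linspace(1e-6, 1 - 1e-6, 20001)
posterior = beta_pdf(theta, a, b)
mask = theta >= 0.5
p_h = trapezoid(posterior[mask], theta[mask])       # P(theta >= 0.5 | data)
```

The single number `p_h` is the analyst's degree of belief in the hypothesis itself, which is exactly the kind of statement the frequentist setting does not permit.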
Third. In the risk context, adherents of the Bayesian perspective further proclaim the
superiority of the methodology by drawing on its capacity to explicitly accommodate
subjective views in the process of decision making. Synthesizing the information provided by
the sample and the corresponding prior model into a posterior density, on which subsequent
scrutinies rest, is not only intuitive but also necessary and essential, particularly, when limited
data exist or most of the accessible information is subjective (see, Section 5.3).
The aforementioned features of the Bayesian approach have led to its most significant
application – Bayesian predictive inferences – due to which the methodology has gained
significant ground in different fields of finance, particularly portfolio management and market
risk management (Rachev et al., 2008 [115]). The ability of the Bayesian framework to derive
predictive inferences contributes greatly to its recognition as a tool to improve risk
assessments. Predictive inferences are derived from the knowledge presented by the ‘predictive distribution’, defined as the distribution of the next datum given a realized sequence of data; it can be estimated using the (joint) posterior distribution of the parameter(s) of the data-generating process. The following equation, adopted from Rachev et al. (2008) [115], exhibits the mathematical definition of the predictive distribution, which serves as a tool of great flexibility since it can be used to derive both point and interval predictions:
f_{Ұ+}(ұ+ | ұ) = ∫ f_{Ұ+}(ұ+ | ζ) f_Ζ(ζ | ұ) dζ    (5.4)
where ұ+ stands for the next datum and is assumed to be a random draw from f_{Ұ+}(ұ+ | ζ). The function f_Ζ(ζ | ұ) denotes the posterior distribution of the parameter Ζ, computed as depicted by Equation (5.1). Note that ζ represents a realization of Ζ, which is treated as a random variable in the Bayesian setting. By virtue of integrating over the values of the parameter(s), the predictive distribution depends solely on the past observations (i.e., past realizations of the random variable Ұ). In other words, the predictive density absorbs all the uncertainty associated with the parameter(s) by averaging over their values. Any inference derived from the predictive distribution would, therefore, reflect the epistemic uncertainty in
addition to the aleatory uncertainty which is accounted for by the sampling distribution. Classical statistics fails to incorporate the former type of uncertainty; instead, it conjectures values for the unknown parameters and treats those assumptions as the true values, which are then fed into the predictive density.
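As a concrete illustration of Equation (5.4), the sketch below evaluates the predictive density for a normal model with unknown mean and known variance by direct numerical integration over the parameter, and checks the result against the conjugate closed form. All numerical values are illustrative assumptions, not thesis data:

```python
import numpy as np

# Posterior predictive via Equation (5.4): average the sampling density
# over the posterior of the unknown mean (known variance sigma2).

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def trapezoid(yv, xv):
    return float(np.sum((yv[1:] + yv[:-1]) * np.diff(xv)) / 2.0)

def predictive_pdf(y_next, data, sigma2, mu0, tau02):
    """f(y+ | y) = integral of f(y+ | zeta) f(zeta | y) d(zeta) over a grid."""
    n = len(data)
    tau_n2 = 1.0 / (1.0 / tau02 + n / sigma2)              # posterior variance
    mu_n = tau_n2 * (mu0 / tau02 + np.sum(data) / sigma2)  # posterior mean
    half_width = 10.0 * np.sqrt(tau_n2)
    zeta = np.linspace(mu_n - half_width, mu_n + half_width, 4001)
    integrand = normal_pdf(y_next, zeta, sigma2) * normal_pdf(zeta, mu_n, tau_n2)
    return trapezoid(integrand, zeta), mu_n, tau_n2

data = np.array([0.8, 1.1, 0.9, 1.3])
dens, mu_n, tau_n2 = predictive_pdf(1.0, data, sigma2=0.25, mu0=0.0, tau02=4.0)
# Conjugacy gives the closed form N(mu_n, sigma2 + tau_n2) for comparison; the
# extra tau_n2 term is the epistemic uncertainty a plug-in estimate discards.
closed_form = normal_pdf(1.0, mu_n, 0.25 + tau_n2)
```

The agreement between `dens` and `closed_form` confirms that the predictive variance exceeds the plug-in variance by exactly the posterior variance of the mean.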
The increasing prominence of the Bayesian philosophy in the field of risk is indeed
nurtured by its capacity to yield predictive densities which combine all sources of uncertainty.
This characteristic can essentially benefit any analysis involving a series of unknown
parameters that cannot be determined with certainty. One of the fields of risk where the
Bayesian views have penetrated the consciousness of practitioners and academics is the mean-
variance analysis in portfolio selection. In this context, the expected value and variance of the
return of a portfolio are expressed as functions of two sets of parameters: (i) expected returns
of the assets comprising the portfolio (ii) covariances between the returns of the assets in the
portfolio. The classical mean-variance approach would proceed by finding an individual
estimate for each of these parameters based on the available samples and would rest the
subsequent computations on the assumption that these estimates represent the true values of
the parameters. However, since the approach does not recognize the estimation uncertainty or
ambiguity associated with the parameters, it is subject to significant errors in the optimal
weights selected for the portfolio assets. This issue can be alleviated using the approach described by Equation (5.4); that is, the multivariate predictive distribution of the future returns of the assets can be obtained by integrating (averaging) over the values of the aforelisted parameters (i.e., the expected returns of the assets and the covariances between them). The predictive density yielded by this process would be independent of the
parameters. The reader is referred to Rachev et al. (2008) [115] for detailed mathematical
descriptions.
Other fields of finance adopt the Bayesian wisdom along similar lines. It has emerged as
the key inference technique in other frameworks of optimal portfolio selection, predicting
returns, gauging market efficiency under different measures, and volatility modeling, just to
name a few. It lies beyond the scope and expertise of this thesis to elaborate on all the listed
applications. In the interest of clarity and thoroughness, one additional example will follow
which aligns closely with the objectives pursued by this research.
Siu et al. (2004) [134] implement the VaR inference procedure for two cases of linear and non-linear portfolios and demonstrate how the Bayesian approach improves the inferences. The methodology involves choosing a model which intends to
describe the stochastic process of the portfolio value and treats its underlying parameters as
random variables. The posterior joint distribution of these parameters is then computed by
hybridizing the objective knowledge inferred from the historical data and the subjective
knowledge (i.e., a prior density) provided by the authors. The estimated posterior joint
distribution is subsequently utilized to find the predictive distribution of the future portfolio
value which in turn serves as the basis of VaR calculation. The VaR figure can be updated in
sequence as more data accrue. The outcomes of the research indicate the superiority of the Bayesian VaR model over its classical counterparts and present it as a ‘good alternative’ to the classical approaches.
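The general mechanics of such a predictive VaR can be sketched as follows. This is a deliberately simplified i.i.d. normal-returns model with a normal-inverse-gamma conjugate prior, not the model of Siu et al.; the return series and all prior settings are synthetic illustrations:

```python
import numpy as np

rng = np.random.default_rng(7)
returns = rng.normal(0.0005, 0.02, size=250)        # synthetic daily returns

# Conjugate normal-inverse-gamma posterior update with a weak prior
mu0, k0, a0, b0 = 0.0, 1.0, 2.0, 1e-4
n, ybar = len(returns), returns.mean()
kn = k0 + n
mun = (k0 * mu0 + n * ybar) / kn
an = a0 + n / 2.0
bn = b0 + 0.5 * np.sum((returns - ybar) ** 2) + 0.5 * k0 * n * (ybar - mu0) ** 2 / kn

# Draw (mu, sigma^2) from the posterior, then returns from the sampling density,
# so that parameter uncertainty is propagated into the predictive quantile.
draws = 100_000
sigma2 = bn / rng.gamma(an, 1.0, size=draws)        # inverse-gamma draws
mu = rng.normal(mun, np.sqrt(sigma2 / kn))          # mu | sigma^2, data
y_next = rng.normal(mu, np.sqrt(sigma2))            # predictive return draws
var95 = -np.quantile(y_next, 0.05)                  # 95% one-period VaR (a loss)
```

As new observations accrue, the same update can be re-run sequentially, mirroring the recursive revision of the VaR figure described above.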
As a closing remark for this chapter, it can be observed that the features of the Bayesian framework offer a unique setting for dealing with the limitations of the risk analysis of IPOs. Essentially, there exist limited data, yet often strong beliefs regarding the performative and
ostensive aspects (Feldman and Pentland, 2003 [54]) of individual IPOs. Such prior views are
typically formed by scrutinizing the counterpart aspects of the comparables of the respective
IPO. The Bayesian methodology suits such a setting in many respects and would serve as a key tool for deriving Bayesian VaR inferences about IPOs. To the best of the author's knowledge, to this day, this research avenue has remained unexplored. Data paucity has hindered risk estimation, particularly through mathematically sensible approaches, and has rendered some of the existing attempts ad hoc or inadequate due to their case-specific or
extremely simplifying assumptions (e.g., Abdou and Dicle, 2007 [1]; Arnold et al., 2010 [10]).
This study is the first of its kind whose focal intent is to unravel the IPO risk analysis from the
Bayesian perspective. Sections 6.3 and 6.4 detail the steps involved in developing this phase of
the methodology.
Chapter 6. METHODOLOGY
6.1. Review of Objectives
The main objective of this thesis is to quantify the short-term risk of investing in IPOs.
Here, ‘risk’ is defined as a value-at-risk measure which is computed based on the distribution
of the IPOI’s stock price 90 trading days subsequent to the day of issuing (PDF90IPOI). The
comprehensive literature review in Section 3.2 established that using comparables would be a
prudent approach to quantify the short-term risk of investing in IPOs. Based on the research
presented, the comparables approach is predominantly used to set target prices and provides
estimates with roughly the same accuracy as other, mathematically intensive valuation approaches. The prevalence and strength of the method in projecting future prices indicate that it can, likewise, be a strong approach to finding the distribution of prices at a given time in the post-IPO market.
The methodology, which fulfills the main objective, pursues the following specific goals:
(i) Phase I: The IPOI’s comparables are identified and the IPOI’s similarity to each of
them is quantified through developing a ‘distance function’ (see, Section 6.2).
(ii) Phase II: PDF90IPOI is estimated in a recursive Bayesian framework which utilizes the
outcomes of Phase I (see, Section 6.3).
(iii) The ‘distance function’ is calibrated by backtesting, using a recursive-cyclic
algorithm. It iteratively runs Phase I and Phase II across different IPOIs and draws
upon the Bayesian framework to find the optimum model parameter whose
likelihood is maximum given the realized market prices (see, Section 6.4).
6.2. Phase I: Comparable Selection
In this research, the comparables method forms the building block of the short-term risk
estimation. Therefore, a more accurate screening of comparables would eventually translate
Chapter 6. Methodology 6.2. Phase I: Comparable Selection
109
into improved risk estimates. This research seeks a process which facilitates a multi-
dimensional comparison of firms in an objective manner. Section 3.3 presents a detailed
review of the existing methodologies for identifying comparables and lists any shortcoming
that may make the respective technique incompatible with the specific objectives of this
research. In brief, the industry-based approaches are the most prevalently applied methods of comparable selection; however, they approximate the
performance of the firm of interest with the overall characteristics of the respective industry,
disregarding firm-specific details such as operating and financial features. Moreover, most of
these techniques do not have the capacity to simultaneously compare multiple aspects of the
operations of the firm of interest with the corresponding mean industry characteristics.
Further review of the literature returned two studies which base the selection of
comparables on multiple idiosyncratic accounting and financial factors: (i) Bhojraj and Lee
(2002) [25], (ii) Simak (2000) [133]. Bhojraj and Lee’s (2002) [25] methodology is regression-based and, consequently, inherits the shortcomings associated with regression, such as the ex-ante specification of the functional form. Furthermore, the methodology is designed such that it requires market information which does not exist in the case of IPOs. Section 3.3
outlines the technique and exhaustively discusses further limitations associated with it (see,
Pages 40-44).
Simak’s (2000) [133] study adopts the DEA framework to perform a multi-dimensional
comparison of characteristics across different firms. The methodology and its merits and
demerits are discussed at great length in Section 3.3 (see, Pages 44-47). The comparable
selection algorithm proposed in this thesis is motivated by the principles of Simak’s (2000) [133] work. As discussed earlier, DEA is a non-parametric approach which endogenously establishes a relationship between the inputs and outputs and thus overcomes the limitations imposed by regression analysis. The DEA approach also accounts for the
economies of scale and would choose comparables such that they have the same economies of
scale as the firm of interest in addition to resembling it in terms of financial performance.
Table 6-1. Input-Output Data Used by the CCR Model in an Input-Oriented Analysis
DMU Input 1 Input 2 Output
A 4 3 1
B 7 3 1
C 8 1 1
D 4 2 1
E 2 4 1
F 10 1 1
The comparables are ideally expected to resemble all aspects of the firm of interest;
however, in practice, such comparables do not exist. This problem can be mitigated by
reducing the comparison criteria to limited pre-determined factors; yet, the complication
continues to hold since it is hard to achieve a comparable firm which matches the firm of
interest in all such factors. Thus, some dimensions must be assigned higher priority, and
hence greater weights, in the comparison. The challenge lies in identifying these dimensions
and the respective weights. Perhaps the most significant benefit of a DEA-based comparison
technique comes from the capacity of DEA to endogenously determine such weights.
As indicated in Section 3.3, a DEA-based comparison method selects comparables from a
set of firms which share identical ‘efficient peers’. This approach can be justified as follows:
The weights received by inputs and outputs of an inefficient DMU are similar to the weights
assigned to the inputs and outputs of the respective efficient peers. Hence, DMUs which are
associated with the same ‘reference set’ would assign similar levels of significance to the
respective inputs and outputs. This point can be further illustrated using the ‘primal’ form of a
simple input-oriented CCR (or CRS) model. The data are collected from Example 2.2 of
Cooper et al. (2007) [41] and are presented in Table 6-1.
The following six equations present the main constraints of the ‘primal’ input-oriented
CCR model:
A: W_O ≤ 4W_I1 + 3W_I2    (6.1)
B: W_O ≤ 7W_I1 + 3W_I2    (6.2)
C: W_O ≤ 8W_I1 + 1W_I2    (6.3)
D: W_O ≤ 4W_I1 + 2W_I2    (6.4)
E: W_O ≤ 2W_I1 + 4W_I2    (6.5)
F: W_O ≤ 10W_I1 + 1W_I2    (6.6)
Here, the parameters W_I1, W_I2 and W_O denote the weights of Input 1, Input 2 and the Output,
respectively. The constraints define the shared feasible region as the half-space lying above the
efficient hyperplanes C, D and E (see, Figure 6-1). Figure 6-2 complements Figure 6-1 by adding the hyperplane of one of the inefficient DMUs (i.e., DMU A). Note that hyperplane A bears the closest proximity to the efficient hyperplanes at the intersection of hyperplanes D and E; this demonstration visualizes why DEA chooses DMUs D and E as the ‘efficient peers’ of DMU A. Figure 6-2 also illustrates why DEA would choose similar final weights for DMUs A, D, and E.
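The peer-selection logic above can be reproduced numerically with the ‘envelopment’ (dual) form of the input-oriented CCR model applied to the Table 6-1 data. The sketch below, which assumes SciPy’s `linprog` and is only an illustration (the thesis ultimately develops a different, slacks-based model), recovers DMUs D and E as the efficient peers of DMU A:

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[4, 3], [7, 3], [8, 1], [4, 2], [2, 4], [10, 1]], dtype=float)  # inputs
y = np.ones(6)                                                                # outputs
names = list("ABCDEF")

def ccr_input_oriented(o):
    """min theta  s.t.  X^T lam <= theta * x_o,  y . lam >= y_o,  lam >= 0."""
    c = np.r_[1.0, np.zeros(6)]                     # variables: [theta, lam_1..lam_6]
    A_ub = np.vstack([
        np.c_[-X[o], X.T],                          # input constraints
        np.r_[0.0, -y][None, :],                    # output constraint (as <=)
    ])
    b_ub = np.r_[0.0, 0.0, -1.0]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 7)
    theta, lam = res.x[0], res.x[1:]
    peers = {names[j]: round(lam[j], 4) for j in range(6) if lam[j] > 1e-6}
    return theta, peers

theta_A, peers_A = ccr_input_oriented(0)            # theta* = 6/7, peers D and E
```

The optimal intensities (lambda_D = 5/7, lambda_E = 2/7) correspond to the projection of DMU A onto the segment between hyperplanes D and E in Figure 6-2.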
Prior to presenting the DEA-based model developed in this thesis for choosing
comparables, Section 6.2.1 describes the pool of candidates for acting as the comparables of the IPOI. Section 6.2.2 subsequently establishes the list of inputs and outputs. This list is scrutinized in advance of outlining the model since the model’s structural form would be impacted by characteristics of these inputs and outputs such as positivity or negativity and the degree of discretion associated with each.
Figure 6-1. This figure demonstrates the efficient hyperplanes determined by the constraints presented in
Equations (6.1) to (6.6). The shared feasible region lies above the efficient hyperplanes.
Figure 6-2. This figure complements Figure 6-1 by adding the hyperplane of DMU A which is represented by
Equation (6.1). DMU A is an inefficient DMU, and thus, its hyperplane lies below the hyperplanes of the
efficient DMUs C, D, and E. The set of ‘efficient peers’ of DMU A comprises DMUs D and E. Note that the
intersection of DMUs D and E is where the hyperplane A has the closest proximity to the efficient frontier. DEA,
therefore, determines similar final weights for DMUs D, E, and A.
6.2.1. Pool of Candidates
In this research, the pool of past IPOs is screened in order to find suitable comparables for
the IPO of interest (IPOI); that is, this study searches for the firms which were comparable to
the IPOI at their issue time. Choosing past IPOs as comparables offers a significant advantage
over selecting concurrent public firms to act as comparables. It enables the methodology to
control for IPO-specific conditions which impact the IPO offer price and short-term post-IPO
market prices. Note that the risk analysis proposed in this thesis draws on the post-IPO market
histories of the comparables in order to estimate the distribution of the IPOI’s stock price 90
trading days subsequent to the issuing day (see, Section 6.3). Past IPOs would provide a more
trustworthy basis for approximating such a distribution since IPO-specific characteristics are
reflected in their post-IPO market histories. Some of these characteristics are listed below:
(i) Underwriters positively affect the success of IPOs. They gauge market sentiment
through road shows and underwriter-specific networks and can exercise
considerable latitude to shape the final offer price based on the investors’ perception
of the respective stock. Furthermore, underwriters may positively impact the
market’s interpretation of financial statements and other disclosures.
(ii) Earnings management is another prevalent phenomenon in IPOs. Issuers take
advantage of income increasing accounting choices in order to represent themselves
in the best light possible.
(iii) Issuers often enter into a price stabilization agreement with the underwriters; that is,
during a ‘relatively brief time limit’ (Geddes, 2003) [57], the underwriter may engage
in share repurchase, at the IPO offer price, in order to balance the excessive selling
pressure in the immediate post-IPO market (see, Section 2.5.2).
The impact of all the aforementioned practices and the like would be taken into account in the risk analysis if the IPOI is compared to other IPOs instead of contemporary public firms.
In addition to controlling for structural characteristics of IPOs, limiting the pool of
candidates to past IPOs would guarantee access to sufficient historical market data for the
comparable IPOs. Such data indeed represent the post-IPO fate of these firms and can be used to
estimate the distribution of the stock price for the respective IPOI, at the time point of 90
trading days after the issuing day (see, Section 6.3).
To further support the notion, this research submits that an IPOI and its prospective
comparable would be matched more closely in terms of growth potential and risk if the set of
comparables to the IPOI comprises past IPOs, as opposed to concurrent public firms. An IPOI may match a public firm with respect to accounting and financial
fundamentals. Yet, different perceptions of future performance may be associated with each of
them, particularly if the public firm is considered to be at its equilibrium stage. Therefore, in
light of the differences in investors’ perspective of the growth potential and risk of the two
firms, it would be more prudent to compare the IPOI to other IPOs. Such an approach can be
supported by the work of Teoh et al. (1998) [135] and Kim and Ritter (1999) [87]. In different
settings, they reported improved valuation outcomes when the IPO under scrutiny is matched
with other recent IPOs or past IPOs from the same industry.
In conclusion to this section, it is worth stressing that the data of the candidate IPOs would
belong to different points in time. To further elaborate on the pool of candidates and the data
set-up, a snapshot of the data is presented in Figure 6-3. In this case, the IPOI took place in
1997. The DEA model would select the comparables from the pool depicted below which
comprises the IPOs preceding the IPOI. The inputs and outputs are, therefore, adjusted for
inflation choosing the issue year of the IPOI as the base year. Furthermore, as explained in the
following section, GDP and the sector-specific index would control for economic productivity
and the performance of the respective sector that can impact other DEA inputs and outputs
such as net income.
6.2.2. Variable Selection
The comprehensive literature review presented in Section 3.4 (see, Pages 47-52) identifies
key value indicators in the market. The multi-dimensional comparable selection can be
founded on these factors. The survey demonstrates the significant impact of accounting
variables on contemporary market prices and how recent accounting information can shape investors’ expectations of the future. It is perceived that the bids in the primary (pre-IPO) and secondary (post-IPO) markets have a fundamental basis and that investor pricing is significantly correlated with the immediately preceding financial performance of the respective firm. Factors deemed as
key value drivers can be summarized as shown below:
(i) Revenue
(ii) Gross profit
(iii) Selling, general and administrative expenses
(iv) Net income
(v) Shareholders’ equity
(vi) Operating cash flow
Figure 6-3. A Snapshot of the Pool of Candidates for a Given IPOI in the ‘Health Care’ Sector.
The choice of DEA inputs and outputs in this research draws upon the above-indicated
factors. The input-output classification of these elements is performed based on the following
principle: A factor is considered an ‘input’ (‘output’) if the efficient peers of a hypothetical
inefficient firm are expected to operate under reduced (increased) levels of the respective
factor. As a case in point, net income fits into the output category since an efficient firm is expected to earn more income than its inefficient peers.
As discussed thoroughly in Chapter 4, DEA is a non-parametric technique which does not
require a priori knowledge of the functional form of the production process. However, inputs
and outputs should be selected appropriately so as to express the true production process. To
meet this criterion and based on the pre-IPO data availability, the inputs and outputs are
configured as follows (see, Figure 6-4):
Inputs
- Capital Expenditure (CAPX)
- Change in Working Capital Other than Cash (ΔWCOC)
- Debt in Current and Long-Term Liabilities (DCL)
- Gross Domestic Product (GDP)
- S&P 500 Sector-Specific Index
Outputs
- Operating Income before Depreciation (EBITDA)
- Net Income (NI)
- Operating Cash Flow (OANCF)
- Change in Cash and Cash Equivalents (CHECH)
- Shareholders’ Equity (SE)
Note that for each IPO, the point factors pertain to the date of the last audited annual
statements released before the offering. Any period accounting factor, or any accounting
variable representing a change, is extracted from the two consecutive audited annual financial
statements which immediately preceded the issue date.
Figure 6-4. General Layout of the DEA Model
provided in Chapter 7. The following points are worth remarking regarding the inputs and
outputs:
Note 1. Annual Capital Expenditure (CAPX):
This input accounts for investment in fixed assets (i.e., property, plant and equipment).
Note 2. Change in Working Capital Other than Cash (ΔWCOC):
In addition to investments in long-term assets, represented by CAPX, a firm likewise
invests in net current assets or ‘net working capital’ (Ross et al., 2010 [122]). The
amount of such investment is represented by ΔWCOC. Equation (6.7), shown below,
presents the components which form ΔWCOC.
A positive value for ΔWCOC would imply that the firm has experienced a cash
outflow for the respective period. This cash outflow can be considered an investment
in the operations. Note that ΔWCOC does not include the changes in the ‘cash’
account which also belongs to the class of ‘current assets’. Due to the significance of
cash, most analysts tend to address this component independently (Ross et al., 2010
[122]). Considering that increasing cash is the ultimate objective of any business,
the change in ‘cash and cash equivalents’ (CHECH) is treated as a DEA output in this
study. CHECH represents the net cash flow from all operating, investing, and
financing activities in the respective period. It is important to note that a positive value
of CHECH would imply a cash inflow.
Change in Working Capital Other than Cash    (6.7)
= Accounts Receivable: (Decrease) Increase
+ Inventory: (Decrease) Increase
+ Accounts Payable and Accrued Liabilities: (Increase) Decrease
+ Income Taxes – Accrued: (Increase) Decrease
+ Assets and Liabilities: Other (Net Change)
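The sign conventions of Equation (6.7) can be made explicit with a small helper. The function name and all figures below are the editor's illustration, not thesis data; the convention is that a positive result represents a cash outflow (an investment in operations):

```python
# Illustrative sketch of Equation (6.7): increases in receivables and
# inventory tie up cash, while increases in payables and accrued taxes
# free it up.  Arguments are period-over-period changes (increase = positive).

def delta_wcoc(d_receivables, d_inventory, d_payables, d_taxes_accrued, d_other=0.0):
    """Change in working capital other than cash; positive => cash outflow."""
    return (d_receivables       # increase consumes cash
            + d_inventory       # increase consumes cash
            - d_payables        # increase releases cash
            - d_taxes_accrued   # increase releases cash
            + d_other)          # net change in other assets and liabilities

# Receivables up 30, inventory up 20, payables up 25, accrued taxes up 5:
investment = delta_wcoc(30.0, 20.0, 25.0, 5.0)      # -> 20.0, a net cash outflow
```

The positive result (20.0) would count as an investment input alongside CAPX, while the change in cash itself (CHECH) is kept separate as an output, as described above.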
Note 3. Debt (DCL) and Stockholders’ Equity (SE):
Financial leverage can raise the return to common shareholders, and hence, it is often
desirable to include some debt in the capital structure provided that the yielded return
on assets exceeds the after-tax cost of debt. Elevated debt levels, however, would
increase the bankruptcy risk as well as the credit risk, burdening further access to the
capital market. One measure that puts the level of debt into perspective is the debt-to-
equity ratio. It can be used to examine a firm’s long-term solvency risk and assess how
risky its capital structure is. The two constituents of this ratio should, therefore, be
factored in for identifying comparables for an IPOI. To further elaborate on these two
factors, it must be stated that in this research, DCL aggregates long-term debt and its
short-term component which is due in one year (i.e., long-term debt due in one year).
Shareholders’ equity represents the total claim on the assets of a firm and consists of
the two accounts of preferred equity and common equity.
Note 4. Gross Domestic Product (GDP) and the S&P 500 Sector-Specific Index:
As noted earlier, the firms participating in the DEA analysis do not belong to the same
point in time. These two factors, therefore, proxy the state of the economy and the condition of the respective sector at the issuing times of individual firms.
In the language of DEA, GDP and the sector index can be considered ‘non-
discretionary’ factors since they do impact the performance of different DMUs; yet,
they cannot be varied at the discretion of the analyst or the management team. The
reader is referred to Section 4.4 for a thorough discussion on this subject. Drawing on
the principle of input-output classification, introduced above, GDP and the sector
index are treated as inputs. This setup would imply that the efficient peers of a given
inefficient firm are expected to operate in a similar or inferior environment. Reversing
the logic would suggest that a group of firms functioning in a more favorable
environment, and hence, yielding superior outputs would not form a suitable set of
efficient peers. The reason is that due to adverse environmental conditions, the
corresponding inefficient firms may not be able to achieve the improved targets which
are set by the efficient peers.
Note 5. S&P 500 Sector-Specific Index:
The S&P 500 sector-specific indices are computed using the large-cap companies in
the respective sectors. In this study, each sector comprises companies with different
sizes of market cap. Therefore, the ‘S&P 1500 composite’ sector-specific index would
be more representative of the state of the respective sector since it is derived based on a
combination of small-, mid- and large-cap companies. However, the scrutiny of S&P
1500 composite indices in all S&P sectors indicates that the data on these indices do
not cover the entire study period (1990-2012) and are supplied only after 1994 in the
COMPUSTAT database (see, Section 7.2). Therefore, the analysis is conducted using
the S&P 500 sector-specific index.
Note 6. Operating Income Before Depreciation (EBITDA):
Commonly referred to as EBITDA, this output denotes ‘sales’ net of ‘cost of goods sold’ and ‘selling, general and administrative expenses’.
As a closing remark for this section, it should be reiterated that the DEA model in Figure
6-4 can be viewed as a black box which utilizes DCL, CAPX, and ΔWCOC – investments in
long-term and net current assets – to yield net income, operating cash flow, and the other
elements shown in Figure 6-4.
6.2.3. The DEA Model
At the outset of this section, it is worth restating the aim of this DEA model. It pursues the
objective of identifying the firms which were comparable to the IPOI at their issuing times.
Each DMU is represented by one point in the 10-dimensional space of inputs and outputs. The
DEA model would map these points into a space with coordinates λ and ψ. The parameter ψ represents a new efficiency measure. It is determined as the value of the objective function in the new DEA model (introduced below). It, however, continues to act as a measure of the distance from the efficient frontier. Using the two DEA parameters, λ and ψ, a new ‘similarity’
metric would then be developed to gauge the degree of similarity of these comparables to the
IPOI.
The scrutiny of DEA inputs and outputs in Section 6.2.2 establishes that the problem
encompasses non-discretionary factors as well as negative numbers. The DEA model is, thus,
required to address these two characteristics of the data. The following two subsections
(Sections 6.2.3.1 and 6.2.3.2) are focused on these aspects of the DEA analysis. In addition, in
this study, the efficiency score approximates the distance from the ‘fully efficient’ frontier. It
should, therefore, incorporate all sources of inefficiency. To meet this prerequisite, this
research will choose a slacks-based model. The radial models would not make suitable
candidates to serve this goal since in this study, the efficiency score of ‘1’ should indicate that
the firm under assessment is positioned on the ‘strong defining or fully efficient hyperplanes’
(Seidl and Sommersguter-Reichmann, 2011 [129]); whereas, in the case of radial models, the
efficiency score of '1' is a 'necessary', but not a 'sufficient', condition for situating a firm on
the fully efficient hyperplanes. Under these models, a firm with an efficiency score
of '1' may be located on the weakly efficient hyperplanes due to the unaccounted-for slacks.
6.2.3.1. Negative Data
In this thesis, some DEA inputs and outputs comprise a mix of negative and positive
numbers. Thus, the DEA model is expected to have the capacity to cope with this
characteristic of the data. The DEA framework was originally developed in the context of non-
negative inputs and outputs. Section 4.3 (see, Pages 64-79) comprehensively reviews the
approaches suggested for extending this framework to negative numbers. These methods can
be broadly classified as follows:
(i) data translation
(ii) treatment of negative outputs (inputs) as positive inputs (outputs)
(iii) individual models such as ‘semi-oriented radial measure’ (SORM) proposed by
Emrouznejad et al. (2010) [49] or ‘variant of the radial measure’ (VRM) suggested by
Cheng et al. (2013) [38].
(iv) range directional models
Section 4.3 details these approaches and presents their respective shortcomings. Here, the
shortcomings associated with the second approach are revisited. Owing to its conceptual
simplicity, this approach has gained wide application; however, it carries significant
limitations. The SORM model was derived from similar principles; that is, the model
disaggregates the vector of each input and output into two sub-vectors of positive and
negative numbers. Thus, in addition to the model-specific limitations noted in Section 4.3.3,
the SORM model suffers from similar defects.
Method (ii), indicated above, is mathematically impaired for the following reasons:
(i) The most apparent shortcoming is the artificial increase in the number of inputs or
outputs, which can adversely impact the optimization problem. This limitation
is more pronounced when only sparse DMUs engage in the analysis.
(ii) The most fundamental shortcoming associated with this approach can be explained
by referring to the basic goal pursued by DEA: It is employed to evaluate the
relative performance of different firms or decision making units. The performance
of each DMU can be represented by an efficiency score bounded between ‘0’ and
‘1’, with ‘1’ indicating ‘best performers’ which form a piece-wise linear frontier
and ‘envelope’ the inefficient DMUs. The best performers not only serve as
references to identify sources of inefficiency but also as benchmarks for the
inefficient DMUs to strive for. Dividing an individual input or output into two sub-vectors
would lead to targets and efficiency scores that are fundamentally different from what could
have been achieved had the input or output vector been kept intact, integrating both negative
and positive numbers.
This point can best be substantiated by the following example. Consider 'net
income' as a DEA output. The analysis seeks to maximize net income, which is,
thus, treated as an output. Suppose a percentage of the DMUs under scrutiny have
reported negative incomes for the period. Following the above-indicated steps
would lead to two sub-vectors. The sub-vector containing positive net incomes
would be considered an output; whereas, the sub-vector including the absolute
values of the negative net incomes would be treated as an input since the problem
seeks to minimize the absolute values of these negative numbers. The best target that
the above-indicated DEA model could possibly set for this latter sub-vector would
be 'zero' net income. The reason is that the positive net incomes are located on the
output side and, from a purely mathematical point of view, bear zero relevance to
those on the input side. The optimization problem, therefore, does not consider the
DMUs with positive net incomes as potential targets for the DMUs reporting
negative incomes.
Using the same logic, it can be inferred that the efficiency scores and other
optimization results would likewise be fundamentally different for the DMUs with
positive net incomes. In analyzing these DMUs, the optimization problem assumes
that the minimum possible net income is 'zero'. Based on the points highlighted
above, it can be readily asserted that allocating negative and positive numbers to
separate sub-vectors fundamentally changes the nature of the problem and poses an
essentially different one.
(iii) To further comprehend the complications imposed by splitting an input or output
based on sign, one can again refer to the original theory of DEA. One of DEA's
most significant contributions is that it unfolded the possibility of
simultaneously addressing multiple dimensions of a DMU's activity through its
capacity to accommodate multiple inputs and outputs. In practice, it is hardly
feasible for a DMU to simultaneously minimize all its inputs and/or
maximize all its outputs. DEA offers guidance as to which inputs and outputs
should receive prime attention. In the example provided above, if the vector of net
income remains intact, a DMU's negative income would be collectively analyzed
with the rest of its inputs and outputs. DEA may deem other inputs or outputs
more influential on the firm's operation, and hence its results may advise the firm to
focus on those aspects to improve its performance.
(iv) The final shortcoming of this approach concerns 'zero' placeholders. Refer
back to the example of 'net income' mentioned above. The presence of zeros which
simply act as placeholders for negative incomes would skew the optimization
outcomes; DEA would tend to choose the DMUs associated with these zeros as
efficient units, particularly in an input-oriented model, since they have indeed
achieved the minimum value possible.
As concluded in the literature review of Section 4.3, the ‘range directional’ models – RDM
and MSBM suggested by Silva Portela et al. (2004) [132] and Sharp et al. (2007) [131],
respectively – offer the most comprehensive framework to cope with negative data,
particularly, in cases comprising a mix of positive and negative numbers. The range
directional models hold the characteristic of translation invariance while preserving the unit-
invariance feature as well as the true production process. In addition, the efficiency scores
yielded by these methods can be readily interpreted, similar to the efficiency scores of
conventional DEA models such as BCC, CCR and SBM. Thus, the efficiency scores resulting
from the range directional models suit the function considered for the efficiency score
in this thesis; that is, the efficiency score of a firm acts as an approximation for its distance
from the efficient frontier. It should, however, be noted that the RDM model is a radial model.
As detailed above, the efficiency score provided by such models may not incorporate all
sources of inefficiency. It may, therefore, represent the distance from a ‘weakly efficient’
hyperplane; whereas, this study aims at gauging the distance of any firm from the ‘strongly
efficient' hyperplanes. Considering the limitations posed by radial models, this research
draws on the slacks-based MSBM model (Sharp et al., 2007 [131]) to estimate the similarity
metric. The model is represented below by Equations (6.8) to (6.12):
Model 1: Modified Slacks-Based Measure [131]

$$\min_{\lambda,\,s^-,\,s^+}\ \rho=\frac{1-\dfrac{1}{m}\sum_{i=1}^{m}\dfrac{s_i^-}{R_{x_{io}}}}{1+\dfrac{1}{s}\sum_{r=1}^{s}\dfrac{s_r^+}{R_{y_{ro}}}}\qquad(6.8)\ \text{(Eq. 6 in [131])}$$

Subject to
$$s^-+X\lambda=x_o\qquad(6.9)$$
$$Y\lambda-s^+=y_o\qquad(6.10)$$
$$e\lambda=1\qquad(6.11)$$
$$\lambda\ge 0,\quad s^-\ge 0,\quad s^+\ge 0\qquad(6.12)$$
The notation of this model is consistent with that of Chapter 4. There exist m
inputs, s outputs, and J firms. The parameters $R_{x_{io}}$ and $R_{y_{ro}}$ denote the 'ranges of possible
improvement' for input i and output r of DMUo, respectively, and can be expressed as follows:

$$R_{x_{io}}=x_{io}-\min_{j}\{x_{ij}\}\qquad(6.13)$$
$$R_{y_{ro}}=\max_{j}\{y_{rj}\}-y_{ro}\qquad(6.14)$$

where r = 1, ..., s; i = 1, ..., m; j = 1, ..., J.
Here, $R_{x_{io}}$ and $R_{y_{ro}}$ are indeed the equivalents of $g_{x_{io}}$ and $g_{y_{ro}}$ presented in Equations (4.68)
and (4.69). The reader is referred to Section 4.3.5 (see, Pages 74-79) for further details on the
mechanism of the range directional models.
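The MSBM program of Equations (6.8) to (6.12) is a linear-fractional program and can be solved as an ordinary linear program after a Charnes-Cooper transformation. The sketch below is a minimal illustration in Python (an assumption of this sketch; the thesis implements its models in MATLAB), with hypothetical toy data; the convention of dropping objective terms whose range of possible improvement is zero follows common slacks-based practice.

```python
import numpy as np
from scipy.optimize import linprog

def msbm_score(X, Y, o):
    """VRS MSBM score (Equations (6.8)-(6.12)) of DMU o, linearized with the
    Charnes-Cooper transformation. X is (m, J) inputs, Y is (s, J) outputs;
    negative entries are allowed because only the ranges enter the objective."""
    m, J = X.shape
    s = Y.shape[0]
    Rx = X[:, o] - X.min(axis=1)              # ranges of improvement, Eq. (6.13)
    Ry = Y.max(axis=1) - Y[:, o]              # Eq. (6.14)
    n = J + m + s + 1                         # variables: [Lambda, S-, S+, t]

    c = np.zeros(n)
    c[-1] = 1.0                               # minimize t - (1/m) sum S-_i / Rx_i
    for i in range(m):
        if Rx[i] > 0:                         # zero-range terms are dropped
            c[J + i] = -1.0 / (m * Rx[i])

    A_eq, b_eq = [], []
    row = np.zeros(n)
    row[-1] = 1.0                             # t + (1/s) sum S+_r / Ry_r = 1
    for r in range(s):
        if Ry[r] > 0:
            row[J + m + r] = 1.0 / (s * Ry[r])
    A_eq.append(row); b_eq.append(1.0)
    for i in range(m):                        # S- + X.Lambda = t * x_o, Eq. (6.9)
        row = np.zeros(n)
        row[:J] = X[i]; row[J + i] = 1.0; row[-1] = -X[i, o]
        A_eq.append(row); b_eq.append(0.0)
    for r in range(s):                        # Y.Lambda - S+ = t * y_o, Eq. (6.10)
        row = np.zeros(n)
        row[:J] = Y[r]; row[J + m + r] = -1.0; row[-1] = -Y[r, o]
        A_eq.append(row); b_eq.append(0.0)
    row = np.zeros(n)
    row[:J] = 1.0; row[-1] = -1.0             # e.Lambda = t (VRS), Eq. (6.11)
    A_eq.append(row); b_eq.append(0.0)

    res = linprog(c, A_eq=np.array(A_eq), b_eq=b_eq,
                  bounds=[(0.0, None)] * n, method="highs")
    return res.fun                            # equals rho at the optimum

# Hypothetical one-input, one-output example: DMU 0 is efficient, DMU 2 is not.
X = np.array([[2.0, 4.0, 4.0]])
Y = np.array([[4.0, 5.0, 3.0]])
print(msbm_score(X, Y, 0), msbm_score(X, Y, 2))
```

Note that a score of '1' places the DMU on the strongly efficient frontier, while a score of '0' is attainable whenever a slack exhausts its full range of possible improvement.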
6.2.3.2. Non-Discretionary Factors
The problem at hand includes non-discretionary (ND) inputs: GDP and sector index. As
stated earlier, the comparable firms are expected to resemble the IPOI in terms of discretionary
factors such as period net income and net working capital (see, Section 6.2.2). These
comparable firms are likewise expected to operate in a similar environment. More
specifically, consider a firm with a similar or higher net income which is compared to the IPOI.
The figure of net income encompasses the impact of the market. Thus, if the firm is running
during a ‘hot’ period of a sector, its net income is indeed not comparable to the net income of
the IPOI operating in a ‘cold’ period.
Section 4.4 comprehensively reviews the DEA studies which focused on the ND factors.
The objective functions suggested by these studies bear no direct relevance to the ND
variables. These factors impact the objective function only indirectly by means of their
presence as the constraints of the optimization program. Such an approach is supported by the
viewpoint that the ND factors are exogenously fixed and cannot be enhanced at the discretion
of the management team. Thus, such factors should not be incorporated into the efficiency
score which provides a consolidated insight into the aspects which can be improved. The DEA
models addressing the ND factors can be classified as below (see, Section 4.4 for further
details).
Non-Radial Models
(i) The slacks associated with the ND factors are set to be zero which would a priori
obligate the respective dimensions to be on the efficient frontier.
(ii) The constraints of zero ND slacks are relaxed; yet, the ND slacks are not allowed in
the objective function.
The first group ensures that the firm under scrutiny matches its efficient target with
respect to the exogenously fixed ND factors. In theory, this class may appear to
offer the ideal solution to the problem of ND variables. Further scrutiny of this
class, however, revealed that the optimizations would often be infeasible due to the
strict constraints. This point is further discussed below. It was also observed that in
most of the cases where the optimizations were executed successfully, these models
tend to set the firm under investigation as its own target. Such a result is often
expected with stringent constraints.

Under the second class, the ND slacks may be set such that significant
differences exist between the environments of the firm under scrutiny and its peers.
Radial Models:
This setting accommodates the characteristics of the ND factors by excluding the
radial efficiency parameter from the ND constraints. Thus, the ND variables cannot
be improved radially. If the analysis involves the second phase which is slack
maximization (see, Section 4.2), this group would suffer from the same
shortcomings associated with the non-radial models.
This study searches for a methodology which chooses the efficient target such that it
coincides with the respective firm of interest in terms of the ND factors. The methodology
should, however, resolve the infeasibility issue caused by too many stringent constraints,
forcing the ND slacks to be zero. To the best of the author’s knowledge, no DEA study has
accomplished the aforementioned objectives. The subsequent discussion outlines the
methodology proposed by this study to tackle this problem. It should be noted that this
discussion assumes ‘variable returns to scale’ (see, Section 4.1) since the analysis includes
firms of different sizes and scales.
The DEA analysis can be viewed as carrying out the following two tasks simultaneously:
(i) outlining the efficient frontier, and hence, the efficient firms, (ii) identifying the efficient
peers and computing the efficient targets for respective inefficient firms. To execute these two
tasks, DEA would run J optimizations given that the analysis includes J DMUs. If the efficient
frontier (i.e., the convex hull of the existing DMUs) were provided in advance, the analysis
would then transform into a new problem: a mapping problem. This problem would address
how to map the inefficient DMUs onto the convex hull enveloping them and how to determine
the appropriate efficient targets for the respective inefficient DMUs. The constraints in the
DEA model indeed set the mapping rules. To provide examples, consider Equations (6.9) and
(6.10). These two equations indicate that the efficient target of DMUo is one which uses
fewer inputs and produces more outputs than DMUo. These criteria would be met by more than
one point and it is the objective function that specifies which target should be selected. The
optimization program presented by Equations (6.8) to (6.12) tends to choose the farthest target
which satisfies all the constraints.
The previous paragraph clarifies that one can distinguish between the determination of the
efficient frontier and conducting the efficiency analysis, both of which are carried out
concurrently by DEA. To further clarify the point of the previous paragraph, it is important to
note that the efficient frontier (i.e., the convex hull) can be obtained solely based on the
geometrical properties of the data, with no need for J optimizations. Based on this
interpretation, it can be concluded that if the structure of the convex hull is known, then by
means of an optimization problem, one can identify the desired efficient target for the
inefficient firm of interest. Figure 6-5 visualizes this point using a two-input and one-output
case. For the sake of simplicity, it is assumed that the DMUs generate identical outputs. The
vertical axis represents a non-discretionary input and the horizontal axis describes a
discretionary input. Figure 6-5 demonstrates that DMU E can be improved in different
directions and each direction would result in a different efficient target (e.g., E′₁). It is the
constraints which determine the route of enhancement. If the ND slacks are set to zero, route 4
would be selected, which leads to the efficient target E′₄.
In this thesis, a DEA model is used to set the rules for choosing the routes and efficient
targets (see, Equations (6.15) to (6.21) below). It is assumed that the efficient frontier is
known, and hence, each DEA analysis would only incorporate the DMU of interest (i.e., E)
and the efficient DMUs (i.e., A, B, C, and D). The constraints of this new DEA model can be
set so that it leads to efficient targets with desired characteristics. As an example, the ND slack
of DMU E can be set to zero in advance. In this approach, however, the optimization would
not be deteriorated by an excessive number of constraints. The reason is that the efficient
frontier is known a priori and any hyperplane represented by the constraints would eventually
intersect the frontier.
Figure 6-5. A Demonstration of Several Potential Improvement Directions for DMU E
The methodology presented above assumes that the efficient frontier is known a priori. In
order to identify the efficient DMUs forming the efficient frontier, this thesis uses the MSBM
model (Sharp et al., 2007 [131]) presented above (Equations (6.8) to (6.12)). At this stage, the
analysis does not distinguish between discretionary and non-discretionary factors since this
stage pursues the sole purpose of determining the frontier. Figure 6-5 further clarifies this
point. In this simple plot, where the efficient frontier is readily observable, the discretionary
and non-discretionary factors are treated identically.
Once the efficient DMUs are detected by Model 1, the model presented below (i.e., Model 2)
is used for each inefficient DMU of interest (shown by the subscript 'o'
below). The parameters π and ν in Model 2 denote the numbers of discretionary and non-
discretionary inputs, respectively. The numbers of discretionary and non-discretionary outputs
are correspondingly displayed by ξ and ω. The number of DMUs used in each optimization is
reduced to κ. This implies that the number of efficient DMUs identified by Model 1 is 'κ −
1'. The rest of the parameters preserve their meanings. The symbol '^' emphasizes that the
corresponding vectors have been trimmed down since only the efficient DMUs and one
inefficient DMU participate in each optimization.
Note. Model 2 is prone to be infeasible if the route of improvement dictated by the constraints
only intersects the weakly efficient hyperplanes of the efficient frontier. If, for a given
inefficient DMU of interest, the optimization does not converge to a solution, the equality
signs in Equations (6.17) and (6.19) are relaxed to the corresponding non-strict inequality
signs. In this research, this change is consistent with the underlying principle of the DEA
framework, which aims at setting achievable targets for an inefficient DMU, since it implies
that the efficient peers operated in an environment inferior to that of the inefficient DMU.
They can, therefore, be accepted as potential targets for the inefficient DMU.
Model 2

$$\min_{\hat{\lambda},\,\hat{s}^-,\,\hat{s}^+}\ \psi=\frac{1-\dfrac{1}{\pi}\sum_{i=1}^{\pi}\dfrac{\hat{s}_i^-}{\hat{R}_{x_{io}}}}{1+\dfrac{1}{\xi}\sum_{r=1}^{\xi}\dfrac{\hat{s}_r^+}{\hat{R}_{y_{ro}}}}\qquad(6.15)$$

Subject to
$$\hat{x}_{io}=\sum_{j=1}^{\kappa}\hat{x}_{ij}\hat{\lambda}_j+\hat{s}_i^-,\qquad i\in\mathrm{D},\ i=1,\ldots,\pi\qquad(6.16)$$
$$\hat{x}_{lo}=\sum_{j=1}^{\kappa}\hat{x}_{lj}\hat{\lambda}_j,\qquad l\in\mathrm{ND},\ l=1,\ldots,\nu\qquad(6.17)\ \text{(see the Note above)}$$
$$\hat{y}_{ro}=\sum_{j=1}^{\kappa}\hat{y}_{rj}\hat{\lambda}_j-\hat{s}_r^+,\qquad r\in\mathrm{D},\ r=1,\ldots,\xi\qquad(6.18)$$
$$\hat{y}_{uo}=\sum_{j=1}^{\kappa}\hat{y}_{uj}\hat{\lambda}_j,\qquad u\in\mathrm{ND},\ u=1,\ldots,\omega\qquad(6.19)\ \text{(see the Note above)}$$
$$\sum_{j=1}^{\kappa}\hat{\lambda}_j=1\qquad(6.20)$$
$$\hat{\lambda}_j\ge 0,\ j=1,\ldots,\kappa;\qquad \hat{s}_i^-\ge 0,\ i\in\mathrm{D};\qquad \hat{s}_r^+\ge 0,\ r\in\mathrm{D}\qquad(6.21)$$
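Model 2 can likewise be linearized with the Charnes-Cooper transformation; what distinguishes it from Model 1 is that the ND factors enter as equality constraints which are relaxed to non-strict inequalities when the program is infeasible, per the Note above. The Python sketch below is an illustrative assumption (the thesis implements its models in MATLAB): it restricts itself to ND inputs, takes the ranges of possible improvement as precomputed arguments, assumes only the efficient peers form the reference columns, and uses hypothetical data.

```python
import numpy as np
from scipy.optimize import linprog

def model2_target(XD, XND, YD, o_xD, o_xND, o_yD, R_x, R_y):
    """Sketch of Model 2 (Equations (6.15)-(6.21)) with ND inputs held at
    equality; the equality is relaxed to '<=' on infeasibility, mirroring
    the Note. Columns of XD (discretionary inputs), XND (ND inputs), and
    YD (discretionary outputs) are the efficient peers; o_* are the values
    of the DMU of interest and R_x, R_y its ranges per (6.13)-(6.14)."""
    pi, K = XD.shape
    xi = YD.shape[0]
    nu = XND.shape[0]
    n = K + pi + xi + 1                       # variables: [lambda, s-, s+, t]

    c = np.zeros(n)
    c[-1] = 1.0                               # minimize t - (1/pi) sum S-_i / R_x_i
    for i in range(pi):
        if R_x[i] > 0:
            c[K + i] = -1.0 / (pi * R_x[i])

    def solve(nd_equality):
        A_eq, b_eq, A_ub, b_ub = [], [], [], []
        row = np.zeros(n)
        row[-1] = 1.0                         # denominator normalization
        for r in range(xi):
            if R_y[r] > 0:
                row[K + pi + r] = 1.0 / (xi * R_y[r])
        A_eq.append(row); b_eq.append(1.0)
        for i in range(pi):                   # (6.16): XD.lam + s- = t * x_o
            row = np.zeros(n)
            row[:K] = XD[i]; row[K + i] = 1.0; row[-1] = -o_xD[i]
            A_eq.append(row); b_eq.append(0.0)
        for l in range(nu):                   # (6.17): XND.lam = (or <=) t * x_o
            row = np.zeros(n)
            row[:K] = XND[l]; row[-1] = -o_xND[l]
            (A_eq if nd_equality else A_ub).append(row)
            (b_eq if nd_equality else b_ub).append(0.0)
        for r in range(xi):                   # (6.18): YD.lam - s+ = t * y_o
            row = np.zeros(n)
            row[:K] = YD[r]; row[K + pi + r] = -1.0; row[-1] = -o_yD[r]
            A_eq.append(row); b_eq.append(0.0)
        row = np.zeros(n)
        row[:K] = 1.0; row[-1] = -1.0         # (6.20): e.lam = t (VRS)
        A_eq.append(row); b_eq.append(0.0)
        return linprog(c, A_ub=np.array(A_ub) if A_ub else None,
                       b_ub=b_ub or None, A_eq=np.array(A_eq), b_eq=b_eq,
                       bounds=[(0.0, None)] * n, method="highs")

    res = solve(nd_equality=True)
    relaxed = not res.success
    if relaxed:                               # fallback per the Note
        res = solve(nd_equality=False)        # assumed to be feasible here
    lam = res.x[:K] / res.x[-1]               # undo the transformation
    return res.fun, lam, relaxed

# Hypothetical peers: one D input, one ND input, one D output.
XD = np.array([[2.0, 4.0]])
XND = np.array([[1.0, 3.0]])
YD = np.array([[4.0, 5.0]])
print(model2_target(XD, XND, YD, [4.0], [2.0], [3.0], [2.0], [2.0]))
```

In the printed example the ND equality pins the peer weights to (0.5, 0.5); calling the same function with an ND input of 4.0 (outside the peers' span) triggers the relaxed fallback.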
6.2.4. Efficient IPOI Treatment
Based on the original principle suggested by Simak (2000) [133], each firm under
investigation would be compared to the following set of firms: It consists of other inefficient
firms which share identical efficient peers with the firm under investigation. The set also
includes the efficient peers themselves. As immediately becomes apparent, one of the
limitations of Simak’s (2000) [133] methodology is that it is not applicable to the cases where
the firm of interest is efficient itself (see, Section 3.3).
This problem is addressed in this thesis using an approach motivated by the slacks-based
measure of ‘super-efficiency’ in DEA (Tone, 2002 [137]; Cooper et al., 2007 [41]). Since DEA
assigns the efficiency score of ‘1’ to all efficient DMUs, the ‘super-efficiency’ method was
first proposed to rank the efficient DMUs (Andersen and Petersen, 2003 [8]). In order to gauge
the super-efficiency of an efficient DMU of interest (DMUo), this method essentially removes
DMUo from the set of DMUs and searches for a reference point P which then serves as the
basis for measuring the super-efficiency score of DMUo. The point P is less efficient
than DMUo and lies in the new production possibility set; that is, the inputs (outputs)
associated with P are more (less) than the inputs (outputs) of DMUo. It is the objective
function and the constraints which would specify where P is desired to be located in the new
production possibility set.
Model 3, introduced below (i.e., Equations (6.22) to (6.32)), solves a non-
linear optimization to find a point similar to P which acts as an analogue of the IPOI.
This point, referred to as U, is defined as the closest point to the IPOI which fulfills the
following conditions: (i) it shares identical non-discretionary inputs and outputs with the IPOI,
(ii) it lies on the new efficient frontier obtained after omitting the IPOI. The efficient IPOI
would then be approximated by point U, and the set of comparables of the IPOI would consist
of (i) the new efficient DMUs whose weighted sum builds the point U, (ii) the inefficient
DMUs associated with these new efficient DMUs. Based on the discussion in Section 6.2.3.2,
it can be readily verified that the efficient DMUs identified by Model 3 are also efficient under
Models 1 and 2 if the IPOI is eliminated.
Model 3

$$\min_{\bar{x},\,\bar{y}}\ \frac{1+\dfrac{1}{\pi}\sum_{i=1}^{\pi}\dfrac{\bar{x}_i-\bar{\bar{x}}_i}{\max_n\{x_{in}\}-\min_n\{x_{in}\}}}{1-\dfrac{1}{\xi}\sum_{r=1}^{\xi}\dfrac{\bar{y}_r-\bar{\bar{y}}_r}{\min_n\{y_{rn}\}-\max_n\{y_{rn}\}}}\qquad(6.22)$$

Subject to
$$-\bar{x}_i+\sum_{\substack{j=1\\ j\ne\mathrm{IPOI}}}^{\kappa-1}x_{ij}\lambda_j\le 0,\qquad i\in\mathrm{D},\ i=1,\ldots,\pi\qquad(6.23)$$
$$\bar{x}_l=\sum_{\substack{j=1\\ j\ne\mathrm{IPOI}}}^{\kappa-1}x_{lj}\lambda_j,\qquad l\in\mathrm{ND},\ l=1,\ldots,\nu\qquad(6.24)\ \text{(see the Note below)}$$
$$\bar{y}_r-\sum_{\substack{j=1\\ j\ne\mathrm{IPOI}}}^{\kappa-1}y_{rj}\lambda_j\le 0,\qquad r\in\mathrm{D},\ r=1,\ldots,\xi\qquad(6.25)$$
$$\bar{y}_u=\sum_{\substack{j=1\\ j\ne\mathrm{IPOI}}}^{\kappa-1}y_{uj}\lambda_j,\qquad u\in\mathrm{ND},\ u=1,\ldots,\omega\qquad(6.26)\ \text{(see the Note below)}$$
$$\sum_{\substack{j=1\\ j\ne\mathrm{IPOI}}}^{\kappa-1}\lambda_j=1\qquad(6.27)$$
$$\bar{x}_l=\bar{\bar{x}}_l,\qquad l\in\mathrm{ND},\ l=1,\ldots,\nu\qquad(6.28)\ \text{(see the Note below)}$$
$$\bar{y}_u=\bar{\bar{y}}_u,\qquad u\in\mathrm{ND},\ u=1,\ldots,\omega\qquad(6.29)\ \text{(see the Note below)}$$
$$\bar{\bar{x}}_i\le\bar{x}_i\le\max_n\{x_{in}\},\qquad i\in\mathrm{D},\ i=1,\ldots,\pi,\ n=1,\ldots,N\qquad(6.30)$$
$$\min_n\{y_{rn}\}\le\bar{y}_r\le\bar{\bar{y}}_r,\qquad r\in\mathrm{D},\ r=1,\ldots,\xi,\ n=1,\ldots,N\qquad(6.31)$$
$$\lambda_j\ge 0,\qquad j=1,\ldots,(\kappa-1),\ j\ne\mathrm{IPOI}\qquad(6.32)$$
In Model 3, the point U is represented by the discretionary and non-discretionary inputs and
outputs $\bar{x}_i$, $\bar{x}_l$, $\bar{y}_r$, $\bar{y}_u$, respectively. The discretionary and non-discretionary inputs and
outputs of the IPOI are correspondingly denoted by $\bar{\bar{x}}_i$, $\bar{\bar{x}}_l$, $\bar{\bar{y}}_r$, $\bar{\bar{y}}_u$. In the list of DMUs, the
IPOI is identified by the index 'IPOI'. Consistent with Model 2, there exist π discretionary
inputs, ν non-discretionary inputs, ξ discretionary outputs, and ω non-discretionary outputs.
The function $\max_n$ (or $\min_n$) finds the maximum (or minimum) value of the respective input or
output across all the DMUs which were initially present in the analysis; that is, N represents
the total number of DMUs in the original set which was supplied to Model 1. It is indeed equal
to J in Model 1 and has been replaced by a new symbol in Model 3 to make a clear distinction
between j = 1, ..., (κ − 1) and n = 1, ..., N. As spelled out above, (κ − 1) designates the
number of efficient DMUs identified by Model 1 and N counts all the DMUs fed into Model 1.
Note. The optimization presented as Model 3 would be infeasible if the efficient IPOI resides
on the intersection of weakly and strongly efficient hyperplanes. In such a case, the equality
signs in Equations (6.24), (6.26), (6.28), and (6.29) are relaxed to the corresponding non-strict
inequality signs. The set of comparables of the efficient IPOI is then assumed to consist of the
new efficient DMUs corresponding to the point U and the inefficient DMUs associated with
these new efficient DMUs.
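To make the mechanics of Model 3 concrete, the sketch below solves the non-linear program with SciPy's SLSQP for the discretionary-only special case (ν = ω = 0), so that the ND equalities (6.24), (6.26), (6.28), and (6.29) drop out; Python, the solver choice, and the toy data are illustrative assumptions rather than the thesis's MATLAB implementation.

```python
import numpy as np
from scipy.optimize import minimize

def model3_super_efficiency(X, Y, x_ipoi, y_ipoi):
    """Sketch of Model 3 (Equations (6.22)-(6.32)) for discretionary factors
    only. Columns of X, Y are the efficient DMUs remaining after the IPOI is
    removed; x_ipoi, y_ipoi hold the IPOI's inputs and outputs. The ranges in
    the objective run over the original DMU set, approximated here by the
    efficient DMUs plus the IPOI."""
    m, K = X.shape
    s = Y.shape[0]
    allX = np.column_stack([X, x_ipoi])
    allY = np.column_stack([Y, y_ipoi])
    dx = allX.max(axis=1) - allX.min(axis=1)   # max_n{x_in} - min_n{x_in}
    dy = allY.min(axis=1) - allY.max(axis=1)   # min_n{y_rn} - max_n{y_rn}

    def objective(z):                          # z = [x_bar, y_bar, lambda]
        xb, yb = z[:m], z[m:m + s]
        num = 1 + np.mean((xb - x_ipoi) / dx)  # numerator of (6.22)
        den = 1 - np.mean((yb - y_ipoi) / dy)  # denominator of (6.22)
        return num / den

    cons = [
        {"type": "eq", "fun": lambda z: z[m + s:].sum() - 1.0},        # (6.27)
        {"type": "ineq", "fun": lambda z: z[:m] - X @ z[m + s:]},      # (6.23)
        {"type": "ineq", "fun": lambda z: Y @ z[m + s:] - z[m:m + s]}, # (6.25)
    ]
    bounds = ([(x_ipoi[i], allX[i].max()) for i in range(m)] +   # (6.30)
              [(allY[r].min(), y_ipoi[r]) for r in range(s)] +   # (6.31)
              [(0.0, None)] * K)                                 # (6.32)
    lam0 = np.full(K, 1.0 / K)                 # feasible starting point
    z0 = np.concatenate([np.maximum(X @ lam0, x_ipoi),
                         np.minimum(Y @ lam0, y_ipoi), lam0])
    res = minimize(objective, z0, bounds=bounds, constraints=cons,
                   method="SLSQP")
    return res.fun

# Hypothetical case: remaining efficient DMUs B(4, 5) and C(3, 3); the
# removed IPOI is (2, 4), so its super-efficiency score should exceed 1.
X = np.array([[4.0, 3.0]])
Y = np.array([[5.0, 3.0]])
print(model3_super_efficiency(X, Y, np.array([2.0]), np.array([4.0])))
```

The score exceeds unity because U must use at least as much input and produce at most as much output as the removed IPOI, in line with the fourth condition discussed below.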
Beyond serving the specific features of the problem at hand, Model 3 expands the 'super-
efficiency' literature in DEA by satisfying all the following conditions simultaneously:
First. It is unit-invariant.
Second. It is translation-invariant and capable of handling mixes of negative and positive
values while keeping the vector containing the mix intact; that is, the proposed model does
not require segmenting a given vector of input or output into two sub-vectors of negative
and non-negative elements.
Third. It accommodates non-discretionary inputs and outputs.
Fourth. In compliance with the notion of super-efficiency, the objective function is
designed to yield a super-efficiency score greater than unity. Furthermore, as indicated
above, in the context of super-efficiency, it is often desired to measure the super-
efficiency of an efficient DMU on the basis of a reference point which bears the closest
proximity to it (e.g., Tone, 2002 [137]). The objective function in Model 3 meets this
criterion and leads the optimization program to identify the closest reference point for the
respective efficient DMU, while preserving the properties of unit-invariance and
translation invariance.
To the best of the author’s knowledge, a model which fulfills all the aforementioned
requirements is unprecedented in the super-efficiency literature of DEA. The most intricate
challenge lies in devising the objective function. This research has developed and examined
various alternatives, none of which conformed to all the specifications listed above. In particular,
the fourth criterion, which seeks the closest reference point, must be approached with care. As
a case in point, the limitation of one candidate solution is outlined below. It caters to the
characteristics of unit-invariance and translation-invariance, yet fails to properly address the
fourth condition.
The next equation presents the objective function suggested by Tone (2002) [137]. The
notation has been modified for the sake of consistency with Model 3.

$$\min\ \delta=\frac{\dfrac{1}{\pi}\sum_{i=1}^{\pi}\dfrac{\bar{x}_i}{\bar{\bar{x}}_i}}{\dfrac{1}{\xi}\sum_{r=1}^{\xi}\dfrac{\bar{y}_r}{\bar{\bar{y}}_r}}\qquad(6.33)\ \text{[137]}$$
As can be observed, Tone’s (2002) [137] objective function is unit-invariant; however, it is not
translation-invariant and cannot be employed in the case of negative numbers. Thus, motivated
by the principles of range directional models, Equation (6.33) is modified into the following
form:
$$\min\ \delta=\frac{\dfrac{1}{\pi}\sum_{i=1}^{\pi}\dfrac{R_{\bar{x}_i}}{R_{\bar{\bar{x}}_i}}}{\dfrac{1}{\xi}\sum_{r=1}^{\xi}\dfrac{R_{\bar{y}_r}}{R_{\bar{\bar{y}}_r}}}\qquad(6.34)$$
In Equation (6.34), the 'ranges of possible improvement' for the inputs and outputs of a given
DMU can be computed using the two equations presented below:

$$R_{x_i}=x_i-\min_{j}\{x_{ij}\}\qquad(6.35)$$
$$R_{y_r}=\max_{j}\{y_{rj}\}-y_r\qquad(6.36)$$

where r = 1, ..., ξ; i = 1, ..., π; j = 1, ..., J.
Equation (6.34) is both translation-invariant and unit-invariant. Yet, it violates the fourth
criterion posed above. As discussed earlier, the objective function is used to pick the 'best'
case in the domain determined by the constraints. In minimizing Equation (6.34), the
optimization would tend to choose the closest inputs but the farthest outputs relative to the
efficient DMU of interest (or IPOI); whereas, the goal is to find the closest point to the
efficient DMU in terms of both inputs and outputs. Moreover, in Equation (6.34), $R_{\bar{\bar{x}}_i}$ or $R_{\bar{\bar{y}}_r}$
may happen to be zero, which would render the objective function ill-defined.
6.2.5. Algorithm of Phase I
The focus of all the previous sections was on developing the DEA models which suit the
specific objectives of this thesis as well as the characteristics of the inputs and outputs. This
section outlines how these models are implemented to compute the values of ψ and λ for the
IPOI and its comparables.
The sector under scrutiny is assumed to include n IPOs which took place at different
points in time. The first IPOI is then selected from this set such that the number of past IPOs is
sufficient to reliably conduct the DEA optimizations. This IPOI is assigned the ID b.
Once this step is completed, and the values of ψ and λ are obtained for the IPOI and its
comparables, the analysis selects the IPO succeeding the first IPOI as the new IPOI. The same
steps are repeated to compute the values of ψ and λ for the new IPOI and its comparables. The
remaining IPOs are selected sequentially in an identical manner for the ψ and λ calculations. The
outcomes generated in this step will later be required to calibrate the similarity metric (see,
Section 6.4). The process is visualized by the flowchart presented in Figure 6-6.
The DEA analysis is sensitive to the presence of outliers. They affect the shape of the
efficient frontier and, consequently, the results. In this thesis, the “Jackstrap” approach
proposed by De Sousa and Stošić (2005) [45] is adopted to detect outliers. This bootstrapping
methodology draws on a measure referred to as ‘leverage’ to identify outliers. In essence,
‘leverage’ gauges the collective change in the efficiency scores of the retained DMUs when
one DMU is removed from the analysis. The reader is referred to the original paper for the
specifics. The “Jackstrap” method can be completely automated, and Figure 6-7 and Figure
6-8 demonstrate how the approach is implemented in this thesis.
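The leverage computation at the heart of the "Jackstrap" can be sketched as follows. The root-mean-square form of the leverage and the naive ratio-based score function standing in for the MSBM scores are assumptions of this illustration; the thesis computes the actual DEA scores and a bootstrapped threshold, per Figures 6-7 and 6-8.

```python
import numpy as np

def toy_scores(X, Y):
    """Stand-in efficiency measure (the thesis uses the MSBM scores):
    a single output/input ratio normalized by its maximum."""
    ratio = Y[0] / X[0]
    return ratio / ratio.max()

def jackstrap_leverages(X, Y, score_fn=toy_scores):
    """Leverage of each DMU in the spirit of De Sousa and Stošić (2005):
    the root-mean-square change in the retained DMUs' efficiency scores
    when that DMU is removed from the analysis."""
    J = X.shape[1]
    base = score_fn(X, Y)
    lev = np.empty(J)
    for k in range(J):
        keep = [j for j in range(J) if j != k]
        lev[k] = np.sqrt(np.sum((score_fn(X[:, keep], Y[:, keep])
                                 - base[keep]) ** 2) / (J - 1))
    return lev

# Hypothetical data: the last DMU is an outlier with an extreme output.
X = np.array([[1.0, 1.0, 1.0, 1.0]])
Y = np.array([[1.0, 1.0, 1.0, 10.0]])
print(jackstrap_leverages(X, Y))
```

Removing the outlier reshapes the frontier and shifts every retained score, so its leverage dominates; DMUs whose leverage exceeds the computed threshold (Figure 6-8) would be flagged for removal.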
It should be noted that the analysis in this research is designed such that if the IPOI is
identified as an outlier, the algorithm exits the process of outlier removal. The analysis is then
iterated without outlier detection on account of the fact that if the IPOI is an outlier itself,
other outliers are potentially more relevant to the IPOI than non-outliers.
The steps illustrated in Figure 6-6, Figure 6-7, and Figure 6-8 are implemented in
MATLAB. The written 'functions' can be categorized into three
separate libraries. The first library performs data mining and all the data processing operations.
The second library implements the DEA models. The third library performs the analysis of
determining the values of ψ and λ in the format presented in Figure 6-6 to Figure 6-8. A total
of approximately 7000 lines of script were written to implement the ensuing flowcharts.
Once the values of ψ and λ are determined for the IPOI and its comparables, the distance
between the IPOI and each of its comparables is calculated in the new space of ψ and λ's,
using the following equation:

$$D_{bc}^2=\left\|\lambda_b-\lambda_c\right\|^2+\alpha\,(\psi_b-\psi_c)^2,\qquad c=1,\ldots,\eta\qquad(6.37)$$

where $D_{bc}$ refers to the distance between the IPOI under scrutiny, identified by the ID b,
and the comparable firm c. The number of comparables of the IPOI is denoted by η. The
parameter α is a constant determined by calibrating the model (see, Section 6.4). Each term in
Equation (6.37) is the equivalent of a 'Euclidean distance' in a transformed space. The
distance D is indeed a measure of the similarity between the IPOI and its individual
comparables. The most similar firm (or the closest comparable) would, therefore, be the one
with the minimum distance from the IPOI in the transformed space.
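The distance of Equation (6.37) and the selection of the closest comparable can be sketched as below; the array shapes and the value of α are illustrative assumptions (α is calibrated in Section 6.4).

```python
import numpy as np

def closest_comparable(lam_b, psi_b, lam_c, psi_c, alpha):
    """Squared distances of Equation (6.37) between the IPOI (ID b) and its
    eta comparables in the transformed (lambda, psi) space, plus the index
    of the closest comparable. lam_c is an (eta, dim) array whose rows are
    the comparables' lambda vectors; psi_c is the (eta,) vector of scores."""
    d2 = np.sum((lam_c - lam_b) ** 2, axis=1) + alpha * (psi_c - psi_b) ** 2
    return d2, int(np.argmin(d2))

# Hypothetical values: two comparables; the first shares the IPOI's lambda.
lam_b, psi_b = np.array([0.5, 0.5]), 0.8
lam_c = np.array([[0.5, 0.5], [0.0, 1.0]])
psi_c = np.array([0.9, 0.8])
d2, best = closest_comparable(lam_b, psi_b, lam_c, psi_c, alpha=2.0)
print(d2, best)
```

Here the first comparable wins: its λ matches the IPOI's exactly, so only the ψ term, scaled by α, contributes to its distance.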
Figure 6-6. This flowchart presents the sequence of steps involved in Phase I of the analysis. MATLAB
is used to implement all the steps. The written 'functions' are classified in three different
libraries: (i) Data Mining and Data Processing, (ii) DEA Models, (iii) Analysis. In total, the libraries comprise
approximately 7000 lines of script.
Figure 6-6. Continued.
Figure 6-7. The outlier detection algorithm of De Sousa and Stošić (2005) [45] is implemented as
demonstrated by this flowchart.
Figure 6-7. Continued.
Figure 6-8. This flowchart outlines the process of computing ‘threshold’, indicated in Figure 6-7.
6.3. Phase II: Assessment of Short-Term Risk
The ultimate objective of this thesis is to quantify the short-term risk of investing in IPOs.
Here, 'risk' is defined as a 'value-at-risk' (VaR) measure determined based on PDF90IPOI. This
abbreviation stands for the distribution of the IPOI's stock price at the time point of 90 trading
days subsequent to the issue day. This section outlines the steps involved in estimating
PDF90IPOI.
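Once PDF90IPOI is available, for instance as Monte Carlo samples, extracting a VaR is a one-line quantile computation. The sketch below is illustrative Python (the thesis's implementation is in MATLAB), and the log-normal sample standing in for an estimated PDF90IPOI is made up:

```python
import numpy as np

def var_from_samples(v90_samples, v_ref, confidence=0.95):
    """Value-at-risk from a sampled approximation of PDF90: the loss,
    relative to the reference price v_ref, that is exceeded only with
    probability (1 - confidence)."""
    losses = v_ref - np.asarray(v90_samples, dtype=float)  # positive = loss
    return float(np.quantile(losses, confidence))

# Stand-in for PDF90IPOI: a log-normal V90, the form a GBM model implies.
rng = np.random.default_rng(0)
v90_samples = np.exp(rng.normal(np.log(10.0), 0.4, size=100_000))
var95 = var_from_samples(v90_samples, v_ref=10.0)  # 95% VaR in dollars
```

With these toy numbers the 95% VaR lands near 10 × (1 − e^(−1.645·0.4)), roughly 4.8 dollars per share.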
6.3.1. Stock Pricing Model
The model of geometric Brownian motion (GBM) is selected to estimate PDF90IPOI. There
exist other mathematically intricate models that can be used to model stock prices. Abidin and
Jaffar (2012) [2] provide a comprehensive list of such models. Two examples from their list
would be ‘clustering-genetic fuzzy system’ (CGFS) (Hadavandi et al., 2010 [63]) and ‘Markov–
Fourier grey model’ (MFGM) (Hsu et al., 2009 [72]). The study indicates that such models are
only suitable for predicting long-term prices or the price of the ‘next day’.
The necessity of complex models for projecting long-term trends is not exclusive to stock prices; interest rate estimation is another example. Because interest rate-based derivatives, such as bonds and interest rate swaps, are often long-term, the respective models are required to incorporate factors such as a long-term average and a rate of mean reversion, all of which contribute to the complexity of the framework.
GBM is deemed a reasonable model for describing the short-term evolution of stock prices. The most well-known and widely used option pricing formula, the Black-Scholes-Merton formula (Black and Scholes, 1973 [27]; Merton, 1973 [98]), draws upon GBM to model stock prices. Considering that options are typically short-term investments, it can be inferred that GBM is a suitable candidate for predicting short-term stock prices. Perhaps the most important motive for choosing GBM is its prevalent use among investors by virtue of its simplicity (Abidin and Jaffar, 2012 [2]). The outcomes of this model can potentially influence stock trends, since investors tend to base their analyses of future prices on GBM; this practice can consequently influence their trading strategies as well as the types and sizes of the orders they place.
To further justify the use of GBM in the context of this research, it should be noted that it is
advantageous in the following two respects:
(i) The assessment of risk draws on a short-term post-IPO history, and it is discussed
above that GBM is befitting in such cases.
(ii) The 90-day time window used by the methodology starts from the first trading day
which implies that no history exists prior to the start of this window. Thus, factors
such as long-term average or rate of mean reversion are not applicable in this case.
Considering that such factors would pose no concern in this problem, simpler
models can be adopted to describe short-term trends, and GBM is selected to serve
this purpose.
The stochastic process of the price V is said to follow GBM if it satisfies the following stochastic differential equation:

$$\mathrm{d}V_t = \mu V_t\,\mathrm{d}t + \sigma V_t\,\mathrm{d}Z_t \tag{6.38}$$

In Equation (6.38), the parameter μ is a constant representing the expected rate of return of the stock; the expected drift rate in V can then be expressed as μV. The volatility of the stock price is denoted by the constant σ. Note that the parameters μ and σ are draws from the random variables Μ and Σ, respectively. The variable Z follows a Wiener process, and the change in Z during an infinitesimally small time period dt can be expressed by the following equation:

$$\mathrm{d}Z_t = \epsilon\sqrt{\mathrm{d}t} \tag{6.39}$$

where the parameter ε represents a random variable which has the standard normal distribution N(0,1).
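The discrete-time counterpart of Equations (6.38)-(6.39) is the exact log-normal update V_{t+Δt} = V_t·exp((μ − σ²/2)Δt + σ√Δt·ε). A minimal Python sketch (the thesis's code is MATLAB; all parameter values below are arbitrary):

```python
import numpy as np

def simulate_gbm(v1, mu, sigma, n_days, dt=1.0, rng=None):
    """Simulate one GBM price path via the exact log-normal update
    V_{t+dt} = V_t * exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*eps)."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(n_days - 1)            # the dZ draws of Eq. (6.39)
    log_steps = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * eps
    return v1 * np.exp(np.concatenate([[0.0], np.cumsum(log_steps)]))

# One 90-day path starting from a first-day closing price of 15.0:
path = simulate_gbm(v1=15.0, mu=0.0005, sigma=0.03, n_days=90,
                    rng=np.random.default_rng(42))
```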
6.3.2. Distribution of the Stock Price 90 Days after the Issuing Day

At the onset of this section it was indicated that this research pursues the specific goal of finding the distribution of the IPOI's price at the time point of 90 trading days after the issuing day. This price is, henceforth, referred to as V90. In order to estimate the distribution of V90, one of the required pieces of information is the starting point of the stochastic process of the price; that is, the closing price of the first trading day (V1). Prior to delving into the details of finding the distribution of V90, the subsequent three paragraphs explain the conjecture on the possible distribution of V1.
As will be discussed in Chapter 7, the data of the 'Health Care' sector (Sector 35 under GICS) are used to examine the methodology proposed in this thesis. For all 327 IPOs, which span the period 1990-2012, Figure 6-9 plots the histogram of the ratio of the first-day closing price to the IPO offer price. The histogram closely resembles a log-normal distribution.
Note that the mode of approximately '1' in Figure 6-9 could have been caused by underwriters' price-stabilizing activities. If the IPOI is a 'broke issue' and the underwriters have been granted the 'overallotment' option, they can limit the stock supply in the market by buying back some of the overallotted shares. This practice would prevent or delay the fall of prices below the offer price (see Section 2.5.2) and would lead to a first-day closing price in the vicinity of the offer price. It can, therefore, be concluded that the closing price of the first day often does not reflect the true degree of overpricing. If the stock is overpriced with respect to its short-term performance, this property is expected to be captured by a negative drift in the GBM.
Based on the observation depicted in Figure 6-9, it is assumed that the ratio of the closing price of the first trading day (V1) to the offer price (Voffer) is a random variable which follows a log-normal distribution with the corresponding normal distribution N(ξ, ω²). Consequently, the distribution of V1 for each IPO can be assumed to be log-normal, since the offer price is known a priori for the corresponding IPO. Note that the parameters ξ and ω are themselves assumed to be draws (realizations) from the random variables Ξ and Ω, respectively. The parameter v1 denotes a realization from the log-normal distribution of the random variable V1.

Figure 6-9. The figure presents the histogram of the ratio of 'closing price of the first day to offer price' for 327 IPOs in the 'health care' sector. The IPOs span the time period of 1990-2012.

As established, the price is assumed to follow a GBM. The only unknown parameters of the GBM are μ, σ, and v1. If these parameters are known, the conditional PDF of V90 given μ, σ, and v1, i.e., ℒ_V90(v90 | Μ ≈ μ, Σ ≈ σ, V1 ≈ v1), can be obtained as explained below. The approximate equality Μ ≈ μ can be more precisely expressed as μ < Μ ≤ μ + dμ; the approximate equalities Σ ≈ σ and V1 ≈ v1 can be interpreted similarly.
For arbitrary values of μ, σ, and v1, where v1 denotes the starting point of the stochastic process at time t = 1, the analytic solution of the stochastic differential equation of GBM is presented below:

$$\frac{V_t}{v_1}=\exp\!\left[\left(\mu-\frac{\sigma^2}{2}\right)t+\sigma Z_t\right]\;\;\Rightarrow\;\;V_t=\exp\!\left[\ln v_1+\left(\mu-\frac{\sigma^2}{2}\right)t+\sigma Z_t\right] \tag{6.40}$$

Equation (6.40) implies that ln V_t is a draw from the normal distribution N(ln v1 + (μ − σ²/2)t, σ²t). The variable V_t is, thus, log-normally distributed. The log-normal distribution of V_t can be expressed as follows:

$$\mathcal{L}_{V_t}\!\left(v_t\,\middle|\,\mathrm{M}\approx\mu,\ \Sigma\approx\sigma,\ V_1\approx v_1,\ t\right)=\frac{1}{\sqrt{2\pi}\,\sigma\sqrt{t}\,v_t}\exp\!\left(-\frac{\left[\ln\frac{v_t}{v_1}-\left(\mu-\frac{\sigma^2}{2}\right)t\right]^2}{2\sigma^2 t}\right) \tag{6.41}$$
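As a quick numerical sanity check on the density of Equation (6.41), it should carry unit probability mass over (0, ∞). The sketch below is Python with arbitrary parameter values:

```python
import numpy as np

def lognormal_density_gbm(vt, v1, mu, sigma, t):
    """Conditional density of V_t under GBM, i.e. Equation (6.41)."""
    return (1.0 / (np.sqrt(2.0 * np.pi) * sigma * np.sqrt(t) * vt)
            * np.exp(-(np.log(vt / v1) - (mu - 0.5 * sigma**2) * t) ** 2
                     / (2.0 * sigma**2 * t)))

# Riemann-sum check: the density should integrate to ~1 over (0, infinity).
v = np.linspace(1e-6, 200.0, 400_000)
dens = lognormal_density_gbm(v, v1=10.0, mu=0.001, sigma=0.03, t=90.0)
mass = float(np.sum(dens) * (v[1] - v[0]))
```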
Given the parameters μ, σ, and v1, the distribution of V90 can be estimated using Equation (6.41). However, these three parameters are not deterministic; they represent realizations from the random variables Μ, Σ, and V1. In particular, V1 is a draw from a distribution whose parameters are themselves random: as discussed above, the ratio of the first closing price to the offer price is best described by a log-normal distribution whose two unknown parameters (ξ and ω) denote realizations from the random variables Ξ and Ω, respectively. Considering that Ξ and Ω could potentially be correlated with Μ and Σ, their distributions must be considered jointly in order to find the distribution of V90. The function f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω) designates this joint distribution and is employed to derive the distribution of V90 (i.e., PDF90IPOI) as demonstrated by the equation below:
$$f_{V_{90}}(v_{90})=\int_{\omega=-\infty}^{+\infty}\!\int_{\xi=-\infty}^{+\infty}\!\int_{\sigma=-\infty}^{+\infty}\!\int_{\mu=-\infty}^{+\infty}\left[\,\int_{v_1=-\infty}^{+\infty}\mathcal{L}_{V_{90}}\!\left(v_{90}\,\middle|\,\mathrm{M}\approx\mu,\ \Sigma\approx\sigma,\ V_1\approx v_1\right)f_{V_1}(v_1\mid\xi,\omega)\,\mathrm{d}v_1\right]f_{\mathrm{M},\Sigma,\Xi,\Omega}(\mu,\sigma,\xi,\omega)\,\mathrm{d}\mu\,\mathrm{d}\sigma\,\mathrm{d}\xi\,\mathrm{d}\omega \tag{6.42}$$
where f_V90(v90) symbolizes PDF90IPOI. Note that ℒ_V90 denotes the conditional distribution of the random variable V90 given the event (Μ ≈ μ, Σ ≈ σ, V1 ≈ v1), whereas f_V90(v90) represents the unconditional distribution of V90: it is computed by integrating (averaging) over all values of the underlying parameters and, hence, bears no dependency on any of them. Through the integration process, Equation (6.42) indeed accounts for the 'epistemic' uncertainty associated with V90 (see Section 5.4, Page 98). The only untackled component of Equation (6.42) is f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω), and the next section (Section 6.3.2.1) outlines the approach suggested for its estimation.
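In practice, the nested integrals of Equation (6.42) lend themselves to Monte Carlo: draw (μ, σ, ξ, ω) from the joint, draw V1 through the log-normal first-day model, and push each draw through the GBM solution. The Python sketch below uses a made-up independent joint (`toy_joint`) purely for illustration; it is not the thesis's estimated distribution:

```python
import numpy as np

def sample_v90(n, joint_sampler, v_offer, t=90.0, rng=None):
    """Monte Carlo marginalization of Equation (6.42): returns samples
    whose empirical distribution approximates f_V90(v90)."""
    rng = rng or np.random.default_rng()
    mu, sigma, xi, omega = joint_sampler(n, rng)   # epistemic uncertainty
    v1 = v_offer * np.exp(rng.normal(xi, omega))   # V1/Voffer ~ logN(xi, omega^2)
    z = rng.standard_normal(n)
    return v1 * np.exp((mu - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)

def toy_joint(n, rng):
    # Hypothetical, independent parameter distributions; illustration only.
    return (rng.normal(0.0, 0.002, n), np.abs(rng.normal(0.03, 0.005, n)),
            rng.normal(0.05, 0.02, n), np.abs(rng.normal(0.2, 0.05, n)))

v90 = sample_v90(200_000, toy_joint, v_offer=12.0, rng=np.random.default_rng(1))
```

A VaR, or any other summary of PDF90IPOI, can then be read directly off the samples.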
6.3.2.1. Estimating the Joint Distribution of the Parameters
Prior to delineating the method used to estimate f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω), it should be noted that the IPOs issued prior to the 1st IPOI are, henceforth, referred to as 'pioneers'. Pioneers can be considered the starting 'pool of candidates' (see Section 6.2.1); no IPO from this list is scrutinized as an IPOI. For further clarification of the terminology, consider Figure 6-10. This figure depicts the 'pool of candidates' for each IPOI using the data of the 'health care' sector. To ensure the reliability of the outcomes of Phase I, which is developed in the DEA framework, the first 51 IPOs are reserved as the starting pool of candidates (i.e., the pioneer group).
Estimating f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω) involves two main steps: (i) estimating a prior distribution for f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω), and (ii) estimating a posterior distribution for f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω) by updating the prior distribution within the Bayesian framework. The two steps are detailed in the next subsections (i.e., Sections 6.3.2.1.1 and 6.3.2.1.2).
6.3.2.1.1. Estimating the Posterior Distributions

In order to estimate f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω), it is first assumed that a prior distribution for it exists and is known; how this prior is estimated for individual IPOs is explained later (see Section 6.3.2.1.2). The Bayesian framework is subsequently employed to update the prior distribution of each IPO. The updating process is carried out using the 90-day realized trajectory of stock prices of the respective IPO. The starting point of each trajectory is the closing price of the first trading day (v1), and the data frequency is daily. The update of the prior distribution is expressed mathematically below:
$$f_{\mathrm{M},\Sigma,\Xi,\Omega}^{\mathrm{post}_g}\!\left(\mu,\sigma,\xi,\omega\,\middle|\,u,\ \ln\tfrac{v_2}{v_1},\ \ln\tfrac{v_3}{v_2},\ \ldots,\ \ln\tfrac{v_{n-1}}{v_{n-2}},\ \ln\tfrac{v_n}{v_{n-1}}\right)=\frac{\mathrm{P}\!\left(u,\ \ln\tfrac{v_2}{v_1},\ \ldots,\ \ln\tfrac{v_n}{v_{n-1}}\,\middle|\,\mu,\sigma,\xi,\omega\right)f_{\mathrm{M},\Sigma,\Xi,\Omega}^{\mathrm{prior}_g}(\mu,\sigma,\xi,\omega)}{\displaystyle\int_{\bar{\omega}=-\infty}^{+\infty}\!\int_{\bar{\xi}=-\infty}^{+\infty}\!\int_{\bar{\sigma}=-\infty}^{+\infty}\!\int_{\bar{\mu}=-\infty}^{+\infty}\mathrm{P}\!\left(u,\ \ln\tfrac{v_2}{v_1},\ \ldots,\ \ln\tfrac{v_n}{v_{n-1}}\,\middle|\,\bar{\mu},\bar{\sigma},\bar{\xi},\bar{\omega}\right)f_{\mathrm{M},\Sigma,\Xi,\Omega}^{\mathrm{prior}_g}(\bar{\mu},\bar{\sigma},\bar{\xi},\bar{\omega})\,\mathrm{d}\bar{\mu}\,\mathrm{d}\bar{\sigma}\,\mathrm{d}\bar{\xi}\,\mathrm{d}\bar{\omega}} \tag{6.43}$$
[Figure 6-10 appears here: a matrix whose columns correspond to the IPOIs (the 52nd through 327th IPOs, i.e., IPOI 1 through IPOI 276) and whose rows list the IPO IDs forming each IPOI's pool of candidates.]

Figure 6-10. Using the data of the 'health care' sector (Sector 35 under GICS), this figure provides an overview of the IPOIs and their 'pools of candidates'. Each column presents the IPOs used to carry out Phase I and Phase II for the respective IPOI. The reader is referred to Section 6.2.1 for further information on the pool of candidates. The first 51 IPOs are referred to as 'pioneers'.
In Equation (6.43), the variable u stands for the ratio of the closing price of the first trading day (v1) to the offer price (voffer). The subscript g identifies the IPO under scrutiny and indicates that the respective variable/distribution is ascribed to the g-th IPO. The realized trajectory of the g-th IPO is a vector of size n. The function f_{Μ,Σ,Ξ,Ω}^{prior_g}(μ, σ, ξ, ω) represents the prior form of f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω) for the g-th IPO, and Equation (6.43) solves for the respective posterior distribution, denoted by f_{Μ,Σ,Ξ,Ω}^{post_g}(μ, σ, ξ, ω | u, ln(v2/v1), …, ln(vn/vn−1)).
Since f_{Μ,Σ,Ξ,Ω}^{prior_g}(μ, σ, ξ, ω) is assumed to be known at this stage, the only unknown in Equation (6.43) is the probability P(u, ln(v2/v1), ln(v3/v2), …, ln(vn/vn−1) | μ, σ, ξ, ω), which can be calculated as detailed next.
For a given combination of μ, σ, and two consecutive observed prices v_i and v_{i+1}, where i = 1, …, (n − 1), the following equation can be inferred from the analytic solution of GBM, shown in Equation (6.40):

$$\epsilon_i=\frac{\ln\frac{v_{i+1}}{v_i}-\left(\mu-\frac{\sigma^2}{2}\right)\Delta t}{\sigma\sqrt{\Delta t}},\qquad i=1,\ldots,(n-1) \tag{6.44}$$

Equation (6.44) indicates that if v_i, v_{i+1}, μ, and σ are given, the value of ε_i can be determined recursively by substituting them back into the analytic solution of GBM. The variable ε_i indeed represents the value of the i-th random draw from the standard normal distribution which, in conjunction with v_i, μ, and σ, led to the observed price v_{i+1}.
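The back-substitution of Equation (6.44) can be verified with a round trip: simulate a trajectory from known draws, then recover them. A Python sketch with arbitrary parameters (the thesis's code is MATLAB):

```python
import numpy as np

def gbm_residuals(prices, mu, sigma, dt=1.0):
    """Back out the standard-normal draws eps_i implied by an observed
    price trajectory under GBM (Equation 6.44)."""
    log_ret = np.diff(np.log(np.asarray(prices, dtype=float)))
    return (log_ret - (mu - 0.5 * sigma**2) * dt) / (sigma * np.sqrt(dt))

# Round trip: build a 90-point path from known draws, then recover them.
rng = np.random.default_rng(7)
eps_true = rng.standard_normal(89)
mu, sigma = 0.001, 0.02
steps = (mu - 0.5 * sigma**2) + sigma * eps_true
prices = 10.0 * np.exp(np.concatenate([[0.0], np.cumsum(steps)]))
eps_rec = gbm_residuals(prices, mu, sigma)
```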
Based on Equation (6.44), it can be concluded that the probability P(u, ln(v2/v1), …, ln(vn/vn−1) | μ, σ, ξ, ω), which appears in Equation (6.43), can be computed as follows:

$$\begin{aligned}
\mathrm{P}\!\left(u,\ \ln\tfrac{v_2}{v_1},\ \ldots,\ \ln\tfrac{v_n}{v_{n-1}}\,\middle|\,\mu,\sigma,\xi,\omega\right)
&=\mathrm{P}(u,\epsilon_1,\epsilon_2,\ldots,\epsilon_{n-2},\epsilon_{n-1}\mid\mu,\sigma,\xi,\omega)\\
&=\mathrm{P}(u\mid\mu,\sigma,\xi,\omega)\times\mathrm{P}(\epsilon_1\mid\mu,\sigma,\xi,\omega)\times\cdots\times\mathrm{P}(\epsilon_{n-1}\mid\mu,\sigma,\xi,\omega)\\
&=f_U^{\mathrm{logN}}(u;\xi,\omega)\,\mathrm{d}u\times f_{\mathrm{E}}^{\mathrm{N}}(\epsilon_1;0,1)\,\mathrm{d}\epsilon\times\cdots\times f_{\mathrm{E}}^{\mathrm{N}}(\epsilon_{n-1};0,1)\,\mathrm{d}\epsilon\\
&=\frac{1}{\omega u\sqrt{2\pi}}\exp\!\left(-\frac{(\ln u-\xi)^2}{2\omega^2}\right)\mathrm{d}u\times\frac{1}{(2\pi)^{(n-1)/2}}\exp\!\left(-\frac{1}{2}\sum_{i=1}^{n-1}\epsilon_i^2\right)\mathrm{d}\epsilon^{\,n-1}
\end{aligned} \tag{6.45}$$

where the probability P(u, ε_1, ε_2, …, ε_{n−1} | μ, σ, ξ, ω) is calculated as the product of the probabilities of the realized points, since they are independent of each other. The computation of each term in Equation (6.45) is straightforward: each element of the (n − 1)-tuple (ε_1, ε_2, …, ε_{n−1}) is a sample from the standard normal distribution, whose PDF is denoted by f_Ε^N(ε; 0,1). The variable u, as explained earlier, is drawn from a log-normal distribution, represented by f_U^logN(u; ξ, ω) in Equation (6.45).
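Equations (6.43) and (6.45) can be combined on a discretized parameter grid. The Python sketch below holds ξ and ω fixed purely for brevity (the thesis updates all four parameters jointly) and uses a uniform prior with arbitrary simulation settings:

```python
import numpy as np

def posterior_grid(prices, u, mu_grid, sig_grid, xi, omega, prior, dt=1.0):
    """Discretized Bayes update of Equation (6.43) on a (mu, sigma) grid.
    `prior` has shape (len(mu_grid), len(sig_grid)); xi, omega held fixed."""
    log_ret = np.diff(np.log(np.asarray(prices, dtype=float)))
    log_post = np.log(prior)
    # log-density of u under logN(xi, omega^2), constant over this grid
    log_post = log_post - np.log(omega * u) - (np.log(u) - xi)**2 / (2 * omega**2)
    for i, mu in enumerate(mu_grid):
        for j, sig in enumerate(sig_grid):
            eps = (log_ret - (mu - 0.5 * sig**2) * dt) / (sig * np.sqrt(dt))
            # product of standard-normal densities (Eq. 6.45) in log space,
            # plus the log-return Jacobian term -n*log(sigma)
            log_post[i, j] += -0.5 * np.sum(eps**2) - eps.size * np.log(sig)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()     # normalization = denominator of Eq. (6.43)

# Recover known parameters from a simulated trajectory, uniform prior:
rng = np.random.default_rng(3)
true_mu, true_sig = 0.002, 0.02
steps = (true_mu - 0.5 * true_sig**2) + true_sig * rng.standard_normal(89)
prices = 10.0 * np.exp(np.concatenate([[0.0], np.cumsum(steps)]))
mu_grid = np.linspace(-0.005, 0.01, 16)
sig_grid = np.linspace(0.005, 0.05, 16)
post = posterior_grid(prices, u=1.1, mu_grid=mu_grid, sig_grid=sig_grid,
                      xi=0.05, omega=0.2, prior=np.full((16, 16), 1.0 / 256))
i, j = np.unravel_index(np.argmax(post), post.shape)
```

With 89 daily observations the posterior mode lands close to the true σ, while μ remains much more uncertain, as expected for short histories.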
6.3.2.1.2. Estimating the Prior Distributions

In the case of the 'pioneers' (i.e., the first 51 IPOs in Figure 6-10), the prior distribution required to solve Equation (6.43) is assumed to be uniform, drawing on the notion of 'non-informative' priors detailed in Section 5.3. Note that for each IPO, this distribution is subsequently updated based on the respective IPO's realized market data; hence, the final outcomes are not significantly sensitive to the form initially conjectured for the distribution. In order to estimate the prior distribution for the IPOs which occur subsequent to the 'pioneers', the following method is used:
Starting from the 1st IPOI, its prior distribution is approximated by the posterior distribution of its closest comparable. The closest comparable is defined as the preceding IPO which has the minimum distance from the respective IPOI in the λ-ψ space (see Section 6.2.5). As presented by Equation (6.37) (repeated below), the distance between the IPOI under scrutiny, identified by the ID b, and its comparable c can be quantified as follows:

$$D_{cb}^2=(\lambda_c-\lambda_b)^2+\alpha(\psi_c-\psi_b)^2 \tag{6.37}$$

Example: Consider the 1st IPOI (i.e., the 52nd IPO). The set of comparables to this IPOI consists of the following IPOs: [8, 16, 20, 46]. The 20th IPO exhibits the minimum distance from the 1st IPOI; that is, D_{20,52} is the minimum of all D_{c,52}'s. Thus, the prior distribution of the 1st IPOI, f_{Μ,Σ,Ξ,Ω}^{prior_52}(μ, σ, ξ, ω), is assumed to be equal to the posterior distribution of the 20th IPO, f_{Μ,Σ,Ξ,Ω}^{post_20}(μ, σ, ξ, ω | u_20, ln(v_{20,2}/v_{20,1}), …, ln(v_{20,n}/v_{20,n−1})).
Once the 1st IPOI (i.e., the 52nd IPO) is realized, its prior distribution is updated based on
the respective stock prices observed in the market. The estimated posterior distribution of the
52nd IPO is then recorded to be used when it is selected as the closest comparable to a
subsequent IPOI. In that event, the posterior distribution of the 52nd IPO would act as the prior
distribution for the corresponding IPOI. The approach outlined above for finding the prior and
posterior distributions of the 1st IPOI can similarly be applied to any IPOI which issues after
the 1st IPOI.
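The selection of the closest comparable can be sketched as follows; the (λ, ψ) coordinates below are hypothetical, and Python stands in for the thesis's MATLAB:

```python
def closest_comparable(lam_b, psi_b, comparables, alpha):
    """Pick the comparable minimizing the squared distance of Equation (6.37).
    `comparables` maps IPO id -> (lambda_c, psi_c)."""
    dist2 = {c: (lc - lam_b)**2 + alpha * (pc - psi_b)**2
             for c, (lc, pc) in comparables.items()}
    return min(dist2, key=dist2.get)

# Hypothetical (lambda, psi) coordinates for the example set [8, 16, 20, 46]:
pool = {8: (0.9, 0.4), 16: (0.7, 0.8), 20: (0.82, 0.55), 46: (0.5, 0.6)}
best = closest_comparable(0.8, 0.5, pool, alpha=1.5)
```

With these made-up coordinates the 20th IPO comes out closest, matching the worked example; its stored posterior would then serve as the 52nd IPO's prior.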
6.4. Calibrating the Distance Equation

In order to estimate the distance between a given IPOI and its comparables, the following equation was suggested in Phase I:

$$D_{cb}^2=(\lambda_c-\lambda_b)^2+\alpha(\psi_c-\psi_b)^2,\qquad c=1,\ldots,\eta \tag{6.37}$$

where the subscript c identifies the respective comparable and η stands for the total number of comparables of the corresponding IPOI, which is assigned the ID b. This section outlines the approach proposed to estimate the optimum value of α. The calibration process seeks the model (i.e., the α) under which the likelihood of the observed prices (the V90's) is maximized.

First, α is assigned a value. The methodology proposed in Phase II is then used to obtain the distribution of each IPOI's stock price 90 days after the issue day (i.e., f_V90(v90)). Subsequently, the probability of each IPOI's realized price on day 90 is computed using its f_V90(v90). Considering the 'health care' sector as an example (see Figure 6-10), the above steps imply that there exist a total of 276 distributions under a given α, and consequently 276 numbers, each specifying the probability of the realized v90 of the respective IPOI.
Second, the likelihood of the realized prices under the assumed α is computed as shown below:

$$\mathrm{P}\!\left(\alpha\,\middle|\,V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_G}\!\approx v_{90}^{\mathrm{IPOI}_G}\right)=\frac{\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_G}\!\approx v_{90}^{\mathrm{IPOI}_G}\,\middle|\,\alpha\right)\mathrm{P}(\alpha)}{\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_G}\!\approx v_{90}^{\mathrm{IPOI}_G}\right)} \tag{6.46}$$
In this equation, G refers to the total number of IPOIs; it equals 276 in the case of the 'health care' sector. The approximate equality V90^{IPOI_1} ≈ v90^{IPOI_1} abbreviates the inequality v90^{IPOI_1} < V90^{IPOI_1} ≤ v90^{IPOI_1} + dv90.

The steps indicated above are iterated using different values of α. It is assumed that all values of α are equally likely, i.e., α follows a uniform distribution, which implies that P(α) is a constant. Maximizing Equation (6.46) therefore translates into maximizing the probability P(V90^{IPOI_1} ≈ v90^{IPOI_1}, …, V90^{IPOI_G} ≈ v90^{IPOI_G} | α), which expresses the likelihood of α. The optimum α for the distance function (i.e., Equation (6.37)) is, then, the α whose likelihood is maximum given the observed values of V90 across all the IPOIs.
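Since P(α) is constant, the calibration reduces to a grid search over α on the sum of the log probabilities of the realized v90's. A minimal Python sketch, with a stand-in score function (`toy`) in place of the Phase II output:

```python
import numpy as np

def calibrate_alpha(alpha_grid, log_prob_of_realized_v90):
    """Grid search for the alpha maximizing the joint likelihood of the
    realized day-90 prices; with a uniform P(alpha), maximizing Equation
    (6.46) reduces to maximizing the sum of log probabilities."""
    scores = [sum(log_prob_of_realized_v90(a)) for a in alpha_grid]
    return alpha_grid[int(np.argmax(scores))]

# Stand-in per-IPOI log-probabilities peaking at alpha = 2, for illustration:
toy = lambda a: [-(a - 2.0)**2, -(a - 2.0)**2 / 2]
alpha_star = calibrate_alpha(np.linspace(0.0, 4.0, 41), toy)
```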
Given α, the following decomposition always holds for the probability P(V90^{IPOI_1} ≈ v90^{IPOI_1}, …, V90^{IPOI_G} ≈ v90^{IPOI_G} | α):

$$\begin{aligned}
&\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_G}\!\approx v_{90}^{\mathrm{IPOI}_G}\,\middle|\,\alpha\right)\\
&\quad=\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_G}\!\approx v_{90}^{\mathrm{IPOI}_G}\,\middle|\,V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_{G-1}}\!\approx v_{90}^{\mathrm{IPOI}_{G-1}},\ \alpha\right)\\
&\qquad\times\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_{G-1}}\!\approx v_{90}^{\mathrm{IPOI}_{G-1}}\,\middle|\,V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \ldots,\ V_{90}^{\mathrm{IPOI}_{G-2}}\!\approx v_{90}^{\mathrm{IPOI}_{G-2}},\ \alpha\right)\\
&\qquad\;\vdots\\
&\qquad\times\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_2}\!\approx v_{90}^{\mathrm{IPOI}_2}\,\middle|\,V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1},\ \alpha\right)\times\mathrm{P}\!\left(V_{90}^{\mathrm{IPOI}_1}\!\approx v_{90}^{\mathrm{IPOI}_1}\,\middle|\,\alpha\right)
\end{aligned} \tag{6.47}$$
It is now clear why the probability of the realized v90 of each IPOI was computed above: each probability term on the RHS of Equation (6.47) indeed represents the probability of the realized v90 of the respective IPOI. The following example further clarifies the concept. The probability of the realized v90 of the 2nd IPOI, calculated based on f_{V90}^{IPOI_2}(v90), is indeed equal to P(V90^{IPOI_2} ≈ v90^{IPOI_2} | V90^{IPOI_1} ≈ v90^{IPOI_1}, α), because when the 2nd IPOI is scrutinized, the 1st IPOI has already taken place and v90^{IPOI_1} has been observed.

As a conclusion to this section, the flowchart in Figure 6-11 provides an overview of the calibration process.
Figure 6-11. This flowchart provides an overview of the process of calibrating the 'distance function' and estimating the optimum value for α (see Equation (6.37)).
Chapter 7. DATA
7.1. Fundamental Financial Data
As indicated in Chapter 6, the pre-IPO fundamental financial data are required to conduct
the analysis. For a given firm, at least one pre-IPO audited annual data point is required for
each input and output. After scrutinizing the capabilities of various search engines and the
availability of pre-IPO data in each, the following sources were selected to collect data:
(i) Bloomberg Terminal
(ii) Wharton Research Data Services (WRDS)
(iii) Securities Data Company PlatinumTM (SDC PlatinumTM)
The Bloomberg database offers a comprehensive custom search for the IPOs in a specific industry and/or during a particular time period. In Bloomberg, however, limited pre-IPO financial data are available for the IPOs returned by the search, as Bloomberg only began scrutinizing prospectuses in 2011. The lists of U.S. IPOs were downloaded in 10 categories, each representing one Bloomberg industry. These lists required pruning of spinoffs/carve-outs, shelf registrations, and non-original IPOs; the 'new issues' database of SDC PlatinumTM was used for this purpose. The 9-digit CUSIPs of the Bloomberg lists were fed into SDC PlatinumTM and the SDC matches were retrieved. Spinoffs/carve-outs, shelf registrations, and non-original IPOs were then excluded from the lists of returned matches. In addition, SDC PlatinumTM was used to limit the lists to the U.S. IPOs whose country of domicile is also the U.S.
SDC PlatinumTM consists of different databases which provide detailed information on new issues, mergers and acquisitions, private equity, and more. The data on IPOs and secondary offerings of common stocks are limited to the cases with the 'book-building' mechanism, which is dominant in the U.S. (see Section 2.4 for the details of the mechanism). SDC is updated daily. Some elements may include data from 1970 to the present; for many elements, however, the data are reliably available only for the last ten to fifteen years.
Using sources such as SEC filings and prospectuses, the 'new issues' database of SDC provides the following variables, among many others: offer price, aftermarket prices, some elements of the financial statements, and the numbers of primary and secondary shares offered. The database also includes information on the underwriters' and issuers' profiles, such as industry and location.
In order to retrieve the respective pre-IPO audited annual data, the CUSIP lists returned from SDC PlatinumTM were subsequently fed into the COMPUSTAT database of Wharton Research Data Services (WRDS). As indicated earlier, sector-specific indices are required for the risk analysis in this thesis. Up to this stage, the Bloomberg sector classification was preserved, and the CUSIPs were clustered in 10 Bloomberg sectors. Due to limited sector-specific index data on Bloomberg sectors, the macro list of all the CUSIPs was re-classified into 10 S&P sectors using the Global Industry Classification Standard (GICS) codes.
Compared to other databases such as FACTSET, S&P Capital IQ, and Thomson Reuters
Financial Database, the COMPUSTAT database of WRDS is more comprehensive in terms of
inclusion of pre-IPO data; however, there still exists a considerable number of firms with no
pre-IPO information. These firms were identified and eliminated.
Table 7-1 presents the number of firms retained under each S&P sector once all the modifications and configurations were implemented. The numbers in the second column of Table 7-1 represent the case where the computer program examines the availability of only one pre-IPO audited annual datum for each input and output. It should, however, be noted that in the case of balance-sheet items, two pre-IPO data points are required in order to calculate the 'period' changes of the respective variables. This extra criterion further trims the number of firms; the fourth column in Table 7-1 contains these numbers.
Table 7-1. The table presents the number of remaining IPOs under each S&P sector after completing the process of data mining. The third and fifth columns display the time periods spanned by the respective IPOs.

GICCD (S&P Sector)              | No. of IPOs (1 pre-IPO data point) | Temporal Range (1 pre-IPO data point) | No. of IPOs (2 pre-IPO data points) | Temporal Range (2 pre-IPO data points)
55 (Utilities)                  | 7    | 1990-1998 | 1   | 1993
50 (Telecommunication Services) | 61   | 1990-2010 | 35  | 1993-2010
15 (Materials)                  | 80   | 1988-2012 | 36  | 1991-2012
10 (Energy)                     | 88   | 1988-2011 | 52  | 1993-2011
30 (Consumer Staples)           | 96   | 1988-2013 | 42  | 1991-2013
40 (Financials)                 | 187  | 1989-2012 | 111 | 1992-2012
20 (Industrials)                | 363  | 1988-1992 | 162 | 1990-2012
25 (Consumer Discretionary)     | 560  | 1988-2013 | 208 | 1991-2012
35 (Health Care)                | 604  | 1988-2012 | 327 | 1990-2012
45 (Information Technology)     | 1017 | 1987-2013 | 501 | 1990-2013
7.2. Sector Index and GDP
Four sector-specific indices were retrieved for each S&P sector: (i) S&P 1500 Composite, (ii) S&P 400, (iii) S&P 500, (iv) S&P 600. The index values are provided at monthly (or finer) frequency in the COMPUSTAT database of WRDS; the geometric mean was used to convert the monthly periodicity to an annual frequency.
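One reading of that conversion (an assumption here; the thesis does not spell out the formula) is that the annual index level is the geometric mean of the twelve monthly levels:

```python
import numpy as np

def annualize_index(monthly_values):
    """Annual index value as the geometric mean of the monthly index values
    (one interpretation of the thesis's monthly-to-annual conversion)."""
    x = np.asarray(monthly_values, dtype=float)
    return float(np.exp(np.mean(np.log(x))))

annual = annualize_index([100, 102, 101, 103, 105, 104,
                          106, 108, 107, 109, 110, 112])
```

The geometric mean always lies at or below the arithmetic mean and damps the influence of transient spikes in the monthly series.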
Among the four indices of each sector, S&P 1500 Composite would serve as the most
representative index since each sector comprises firms of diverse market capitalizations.
However, as demonstrated in Table 7-2, the time periods spanned by S&P 1500 Composite do
not fully cover the periods shown in Table 7-1. The widest ranges of data are available for the
S&P 500 index. Therefore, the analysis is based on this class of indices.
There exists an extensive list of firms from different sectors and of different types that can
be utilized to examine the proposed models. IPOs can be categorized into two broad classes:
IPOs in mature industries and IPOs in growing industries. The characteristics of these two
classes should be considered in selecting an appropriate industry/sector for the analysis. The
IPOs in mature industries suggest a robust setting to test the models; the performance of the
proposed methodology can be examined in an environment with less information asymmetry
and more established firms going public. Sector 35 (health care) is the most populated sector
which holds the aforementioned characteristics. Hence, this thesis selects the ‘health care’
sector for examining the methodology.
The annual ‘real’ GDP data were obtained from U.S. Department of Commerce: Bureau of
Economic Analysis [53]. The GDP implicit price deflator was used to adjust the data for
inflation. In each case, the base year was set to be the issue year of the respective IPOI (see,
Section 6.2.1). Considering that this thesis includes the fundamental financial data of a diverse
set of firms, the GDP deflator was deemed to be more reliable to carry out inflation
adjustments as opposed to other common measures such as consumer price index (CPI). The
GDP deflator is a metric obtained based on GDP which, itself, is representative of all the final
goods and services produced within the geographical boundaries of an economy. CPI, on the
other hand, is derived based on a basket of household goods and services. It would, therefore,
provide a less comprehensive insight into changes in the level of prices.
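The rebasing itself is a one-line ratio; the deflator levels below are made-up numbers for illustration:

```python
def deflate(nominal, deflator, base_deflator):
    """Restate a nominal figure in the price level of the base (issue) year
    using the GDP implicit price deflator."""
    return nominal * base_deflator / deflator

# Hypothetical: a 1995 figure restated in issue-year dollars.
real_rev = deflate(50.0, deflator=75.3, base_deflator=81.9)
```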
Table 7-2. Each cell displays the time period spanned by the respective sector and index in COMPUSTAT.

S&P Sector                      | S&P 1500  | S&P 400   | S&P 500   | S&P 600
55 (Utilities)                  | 1994-2013 | 1995-2013 | 1983-2013 | 1994-2013
50 (Telecommunication Services) | 1994-2013 | 1994-2013 | 1991-2013 | 1994-2013
15 (Materials)                  | 1994-2013 | 1994-2013 | 1991-2013 | 1994-2013
10 (Energy)                     | 1994-2013 | 1994-2013 | 1986-2013 | 1994-2013
30 (Consumer Staples)           | 1994-2013 | 1994-2013 | 1991-2013 | 1994-2013
40 (Financials)                 | 1994-2013 | 1994-2013 | 1983-2013 | 1994-2013
20 (Industrials)                | 1994-2013 | 1994-2013 | 1984-2013 | 1994-2013
25 (Consumer Discretionary)     | 1994-2013 | 1994-2013 | 1991-2013 | 1994-2013
35 (Health Care)                | 1994-2013 | 1994-2013 | 1987-2013 | 1994-2013
45 (Information Technology)     | 1994-2013 | 1994-2013 | 1986-2013 | 1994-2013
Chapter 8. RESULTS
8.1. Results of Phase I: Comparable Selection
As established in Chapter 7, the proposed methodology is examined using the IPOs in the 'health care' sector. The process of data mining results in 327 U.S. IPOs spanning the time period of 1990-2012. The first IPO selected for the analysis is the 52nd IPO on the list; that is, the ID of the IPOI is 52, or b = 52 (see Section 6.2.5 and Figure 6-6). The 'pool of candidates' for the first IPOI, therefore, consists of the 51 past IPOs. Once the algorithm identifies and records the comparables for the first IPOI (b = 52), the 53rd IPO on the list is selected as the next IPOI and the process is iterated (refer to Chapter 6 for further details).
As detailed in Section 6.2.5, the MATLAB programming software is used to implement
Models 1, 2, and 3 as well as all the steps demonstrated by the flowcharts in Figure 6-6 to
Figure 6-8. In sum, approximately 7000 lines of script are written to execute Phase I. The
reader is referred to Chapter 6 for the specifics.
The proposed methodology for Phase I is run in three different settings:
RUN 1. Run 1 executes neither outlier detection nor efficient-IPOI treatment (see Sections 6.2.4 and 6.2.5). Consequently, if an IPOI is identified as efficient, it is excluded from the macro list of all 327 IPOs. The reason is the following: no comparables can be identified for an efficient IPOI which does not undergo the process of 'efficient IPOI treatment'; consequently, no prior and posterior joint distributions of the parameters Μ, Σ, Ξ, and Ω (i.e., f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω)) can be estimated for that IPOI. Thus, should it be selected as the closest comparable of a succeeding IPOI, it would not hold the information required to estimate the same distribution for the new IPOI.
RUN 2. The efficient IPOIs are treated; however, the analysis does not execute the
outlier detection step.
RUN 3. The efficient IPOIs are treated and the outliers are detected using the “Jackstrap” approach proposed by De Sousa and Stošić (2005) [45] (see, Sections 6.2.4 and 6.2.5). The number of cycles of bootstrapping is 100. Side analyses indicated that in the setting of this thesis, the value of ‘leverage’ for a DMU does not change significantly when the number of bootstraps is increased beyond 100. It should be noted that the “Jackstrap” approach is computationally expensive and time consuming. To provide a perspective on the run time, consider the 52nd IPO scrutinized using 100 cycles where each random sample (without replacement) contains 35 DMUs; that is, q = 100, m = 35, b = 52 in Figure 6-6 to Figure 6-8. In this case, the minimum number of optimizations executed for a single IPOI would be 100 × 35 × 34 = 119,000. If the total number of IPOIs (i.e., 276) and the growing pool of candidates are taken into account, it can be inferred that the process is considerably time consuming.
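As a rough illustration of the computational load quoted above, the counts can be tallied as in the sketch below. This is a back-of-the-envelope calculation, not the thesis code; the reading of the product 100 × 35 × 34 as m × (m − 1) linear programs per bootstrap cycle is an assumption.

```python
# Back-of-the-envelope count of the DEA optimizations in the Jackstrap run.
# q: bootstrap cycles, m: DMUs per random subsample (without replacement).
# Assumed reading of "100 x 35 x 34": each cycle requires m * (m - 1) LP solves.
q, m = 100, 35

per_ipoi = q * m * (m - 1)       # minimum optimizations for a single IPOI
print(per_ipoi)                   # 119000

# A loose lower bound across all 276 IPOIs; the pool of candidates actually
# grows with each IPOI, so the true total is larger still.
print(276 * per_ipoi)             # 32844000
```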
It should be noted that the contribution of phase I to the risk analysis of IPOs can only be
tested once Phase II is completed. In this section, however, the outcomes of Phase I can be
examined in terms of the robustness of the methodology in choosing the comparables. Figure
8-1 visualizes the number of comparables associated with each IPOI under RUN 1, RUN 2
and RUN 3. The efficient IPOI treatment and the outlier detection tend to smooth out the
erratic changes in the number of comparables per IPOI. Particularly pronounced is the impact
of outlier detection which is executed in RUN 3. The increased number of comparables per
IPOI in RUN 1 can be attributed to the following cause:
In DEA, once an efficient DMU is excluded, based on its spatial position with respect to the
remaining efficient DMUs, the production possibility set may shrink. The efficient frontier
would consequently be reconfigured and new efficient peers and efficient hyperplanes would
be assigned to the inefficient DMUs which were previously associated with the excluded
efficient DMU. The reshaping may impact the classification of the inefficient DMUs with
identical efficient peers. To further clarify this point, consider Figure 8-2.
Chapter 8. Results 8.1. Results of Phase I: Comparable Selection
168
[Figure 8-1 appears here: three bar plots, titled RUN 1, RUN 2, and RUN 3, each with ‘ID of IPOI’ (from 52 to 327) on the horizontal axis and ‘No. of Comparables’ on the vertical axis.]
Figure 8-1. The bar plots visualize the number of comparables associated with the respective IPOIs whose IDs are presented on the horizontal axis.
In Figure 8-2, when the efficient DMU G is excluded, the production possibility set shrinks.
Moreover, the inefficient DMUs which were previously associated with either of the two
hyperplanes GJL and DGJ, are now jointly ‘enveloped’ by the new and larger hyperplane DJL.
This mathematical property gives rise to the difference between the first and second panels in
Figure 8-1; under RUN 1, once an IPOI is identified as an efficient unit, it is excluded from
the ‘pool of candidates’ of any subsequent IPOI. The eliminated efficient IPO might continue
to serve as an efficient unit if it were preserved in the pool of candidates. Its removal can,
therefore, impact the comparables associated with the subsequent IPOIs.
The third panel in Figure 8-1 reflects the impact of detecting outliers in addition to treating
efficient IPOIs. As indicated in Section 6.2.5, outliers significantly impact the form and level
of the frontier. The outliers cap the production possibility set at a higher level. Consequently,
they mask out the impact of other DMUs which could potentially set closer targets for the
respective inefficient IPOI. The production possibility set is segmented more coarsely in the
presence of outliers and their removal can lead to a more detailed clustering of DMUs.
Considering that the comparables for any inefficient IPOI are set to be the firms which share
the same efficient hyperplane with the IPOI, a more detailed clustering would lead to a finer
selection of comparables. This impact is reflected in the smoother bar plot of RUN 3.
Worthy of note are the smaller variance and steadier trend depicted by RUN 3, which are more coherent with practical rationality. Compare the lower panel in Figure 8-1 to the upper two, where RUN 1 and RUN 2 exhibit increasing trends, implying that a more recent IPOI is more likely to be linked with a larger group of comparables. This growth in the number of comparables with the size of the pool of candidates, nevertheless, is incompatible with practical wisdom and intuition: given sufficient data for each IPOI, one does not expect a later IPOI to be associated with, on average, more comparables than an earlier IPOI which took place a few years before.
Figure 8-2. This figure depicts the efficient frontier for the data presented in Table 1.5 of Cooper et al. (2007) [41].
Note that the output ‘Inpatients’ has been eliminated which facilitates visualizing the frontier in 3 dimensions. It is
illustrated how the shape of the frontier changes by the exclusion of the efficient DMU G. The production
possibility set, which is capped by the efficient frontier, spans a smaller space subsequent to the removal of the
efficient DMU G. The segmentation of the frontier changes as well; the number of efficient hyperplanes decreases
in this example.
To further examine the robustness of the methodology in choosing comparables, the
following test is carried out: Once the set of comparables for each IPOI is identified, the
algorithm continues to identify and record the comparables of the respective IPO in the
subsequent analyses where it participates only as a member of the pool of candidates of
another IPOI. To elaborate more, consider IPOI with the ID of 𝑏𝑏 = 52. The 52nd IPO is present
in the pool of candidates of the 53rd IPOI, 54th IPOI and other succeeding IPOIs. The
comparables of the 52nd IPO can be identified following the same principles applied in the
case of IPOIs. In the next step, this test computes the average size of all the sets of
comparables identified for the 52nd IPO. The analysis, subsequently, finds the union of all
these sets and calculates the ratio of the average size to the size of the union set. This ratio will
be referred to as the ‘average-union’ ratio, hereafter. The same steps are iterated for each of
the remaining IPOIs. Figure 8-3 exhibits the ratios for individual IPOs in RUN 3.
A value closer to unity for the average-union ratio indicates less variability in the size of the
set of comparables selected for the respective IPO. The ‘average-union’ ratio would be equal
to unity if the set of comparables remains intact for an IPO through all the executions. Ratios
greater than 70% were observed for 72.24%, 65.72% and 64.86% of the IPOIs in RUN 1,
RUN 2 and RUN 3, respectively.
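The ‘average-union’ ratio itself is straightforward to compute; a minimal sketch follows, with made-up comparable sets for a single IPO (the IDs are illustrative, not taken from the thesis data).

```python
# Hypothetical comparable sets recorded for one IPO across three rounds of
# execution; each set lists the IDs of the selected comparables.
comparable_sets = [
    {8, 16, 20, 46},
    {8, 20, 46, 67},
    {8, 20, 46, 88},
]

avg_size = sum(len(s) for s in comparable_sets) / len(comparable_sets)   # 4.0
union_size = len(set().union(*comparable_sets))                          # 6

average_union_ratio = avg_size / union_size
print(round(average_union_ratio, 3))   # 0.667
```

A ratio of 1 would mean every round selected exactly the same set; here the two replacements (67 and 88 for 16) pull the ratio down to 2/3.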
The goal pursued by the previous analysis was to investigate the changes in the size of the
set of comparables for individual IPOs as the number of IPOs participating in the analysis
grows. It is likewise interesting to inspect the changes in the composition of these sets of
comparables. Figure 8-4 visualizes the change in the structure of the comparables of the 52nd
IPO. The horizontal axis represents the number of IPOs that form the pool of candidates for
the respective analysis of comparables. The vertical axis exhibits the union of all the sets of
comparables identified for the 52nd IPO through all the cycles of executions shown on the
horizontal axis. The red color indicates which comparables are selected for the 52nd IPO in the
corresponding round of execution. To elaborate more, consider the number 52 on the
horizontal axis. It indicates that 52 IPOs were included in the analysis. It additionally implies
that the 52nd IPO engages as an IPOI in this cycle. The number 53 indicates that one new IPO
(i.e., the 53rd IPO) is added to the previous list. In this new set of size 53, the comparables of
the 52nd IPO are selected following the same method used to identify its comparables when it
was an IPOI. The rest of Figure 8-4 was completed by iterating the same approach.
Figure 8-4 demonstrates that the composition of the set of comparables for the 52nd IPO is
only affected by the addition of new IPOs. The new IPOs change the production possibility set
such that some of the former comparables are replaced with the new IPOs (not the previously
existing ones). Further scrutiny of the data reveals that compared to the retained former
comparables, the excluded former comparables are often positioned farther from the IPO
under investigation (i.e., the 52nd IPO in Figure 8-4) in the λ space. Therefore, the new pieces of information tend to remove these comparables from the set of comparables of the IPO under investigation.

Figure 8-3. This figure demonstrates the ‘average-union’ ratio for individual IPOs.

As a case in point, consider the 20th IPO. It almost persistently continues to perform as a comparable for the 52nd IPO. The examination of λ results indicates that for the first 96 cases (i.e., where the number of participating IPOs varies between 52 and 147), the 20th IPO bears the minimum distance from the 52nd IPO in the λ space. In the subsequent cases where the 20th IPO is present, it is observed to be the second most closely positioned IPO to the 52nd IPO in the λ space, dominated only by the 148th IPO.
In Figure 8-4, the set of comparables for the 52nd IPO remains stable during the first 15 cycles of execution. The addition of the 67th IPO changes the composition of the comparables set from [8, 16, 20, 46] to [8, 20, 46, 67]. In other words, on its entry, the 67th IPO replaces the 16th IPO in the set of comparables. The 67th IPO is, in turn, overridden by the 88th IPO once the latter is added, and the 88th IPO is overruled by the 106th IPO upon its addition to the list. Note that the 8th, 20th, and 46th IPOs continue to serve as comparables during these sequences of execution until the 8th IPO is dominated by the 148th IPO upon its entry, and the 46th IPO concedes its position to the 170th IPO once it engages as one of the candidates. The 20th IPO, however, preserves its status as a comparable. The rest of Figure 8-4 can be interpreted in a similar fashion. Similar graphs were observed for other randomly selected IPOs (i.e., the 86th, 109th and 201st IPOs), which attests to the robustness of the proposed methodology in choosing comparables.
Figure 8-4. This figure provides an overview of the changes in the composition of the comparables for the 52nd IPO. The horizontal axis
represents the number of candidates (i.e., the IPOs participating in the respective analysis of comparables). The vertical axis exhibits the
union of all the sets of comparables obtained for the 52nd IPO using the candidate sets of different sizes. Under a given set of candidates,
the selected comparables are colored in red.
8.2. Results of Phase II: Assessment of Short-Term Risk
8.2.1. Review of the Methodology of Phase II
The previous section examined the robustness of the methodology of Phase I (i.e.,
comparable selection). The focus of this section shifts to scrutinizing Phase II which utilizes
the outcomes of Phase I to quantify the short-term risk of investing in an IPO. It has now been
established that Phase I pursues the goal of identifying a group of comparables for every IPOI
on the list of 276 IPOIs. Phase I was originally inspired by the need for a scientific basis which
would compensate for the lack of sufficient history required to carry out the risk analysis of
individual IPOIs. It allows for finding a given IPOI’s closest comparable and utilizing its
relevant historical data in deriving inferences about the respective IPOI. Evidently, the risk of
each IPOI on this list is analyzed on the basis of the preceding IPOs since if one stands at the
issue time of a given IPOI, none of the succeeding IPOs has yet taken place. Figure 6-10
provides a visual aid for demonstrating the process and is frequently cited in this Section. It is,
therefore, duplicated here for convenience. Each column in this figure depicts a separate round
of execution which analyzes the risk of the IPOI highlighted in orange, drawing upon all the
IPOs listed in the corresponding upper cells. Further details on this figure can be found in
Section 6.3.2.1.
Section 8.2 aims at presenting the outcomes of Phase II. The methodology is detailed at
length in Sections 6.2 and 6.3. The reader is encouraged to review these sections since they
form the foundation of the discussions that follow and contain comprehensive expositions of
the risk assessment approach proposed in this research. As a glance at Chapter 6 would indicate, the procedure is involved and not reducible to simple formulae. In the interest of space, therefore, this section does not restate all the steps and restricts itself to the brief review that follows.
[Figure 6-10 appears here: a matrix in which each column lists one pool of candidates. The first column lists IPOs 1-51 followed by IPO 52 (IPOI1); the second lists IPOs 1-52 followed by IPO 53 (IPOI2); and so on, up to the final column, which lists IPOs 1-326 followed by IPO 327 (IPOI276).]
Figure 6-10. Using the data of the ‘health care’ sector (Sector 35 under GICS), this figure provides an
overview of the IPOIs and their ‘pools of candidates’. Each column presents the IPOs used to carry out
Phase I and Phase II for the respective IPOI. The reader is referred to Section 6.2.1 for further information
on the pool of candidates. The first 51 IPOs are referred to as ‘pioneers’.
In concise terms, this research aims at estimating the unconditional probability density function of price at the time horizon of 90 days (V90), for each of the 276 IPOIs depicted in Figure 6-10. This density is referred to as PDF90IPOI and subsequently acts as the basis of the VaR computations for the corresponding IPOI. Approximately 4000 lines of script have been written to execute Phase II.

The random behaviour of the stock price (Vt) has been assumed to be described by a GBM. Thus, if all the parameters of the GBM (i.e., μ, σ, and v1) were given, PDF90IPOI would be a log-normal distribution, computed using Equation (6.41). Nonetheless, these parameters are not deterministic but are indeed realizations of the random variables Μ, Σ, and V1. Particularly, V1 has been assumed to come from a log-normal distribution whose underlying parameters (i.e., hyperparameters) are themselves random. Since the offer price is known a priori for a given IPO, the ratio of first closing price to offer price likewise follows a log-normal distribution whose hyperparameters are designated ξ and ω, representing the mean and standard deviation of the associated normal distribution, respectively. Considering that ξ and ω could potentially be correlated with μ and σ, their distributions must be considered jointly in order to find PDF90IPOI. This joint probability density function is denoted by f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω), and once it is supplied for an IPOI, its predictive density PDF90IPOI can be computed using Equation (6.42).
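To make the role of Equation (6.42) concrete, the sketch below approximates the predictive density as a grid-weighted mixture of log-normal densities. The closed form for ln V90 given (μ, σ, ξ, ω) is this author's reconstruction under the assumptions that Δt equals one day, that 89 GBM steps separate day 1 from day 90, and that the first-day ratio is independent of the subsequent path; the two grid points and their weights are purely illustrative, not thesis results.

```python
import numpy as np

def lognormal_pdf(v, m, s):
    """Density of V where ln V ~ N(m, s**2)."""
    return np.exp(-(np.log(v) - m) ** 2 / (2 * s ** 2)) / (v * s * np.sqrt(2 * np.pi))

def pdf90(v, v_offer, grid, weights):
    """Mixture over a discretized joint density of (mu, sigma, xi, omega)."""
    total = 0.0
    for (mu, sigma, xi, omega), w in zip(grid, weights):
        # ln V90 = ln v_offer + N(xi, omega^2) + (mu - sigma^2/2)*89 + sigma*sqrt(89)*Z,
        # the sum of two independent normals (assumed independence).
        m = np.log(v_offer) + xi + (mu - sigma ** 2 / 2) * 89
        s = np.sqrt(omega ** 2 + sigma ** 2 * 89)
        total += w * lognormal_pdf(v, m, s)
    return total

# Two illustrative grid points with weights summing to one.
grid = [(0.001, 0.03, 0.08, 0.15), (0.0, 0.05, 0.05, 0.18)]
weights = [0.6, 0.4]
print(pdf90(15.0, 14.0, grid, weights) > 0)   # True
```

In the thesis itself the grid carries the full posterior weight of f_{Μ,Σ,Ξ,Ω}; once PDF90IPOI is tabulated this way, the VaR quantile can be read off its cumulative sum.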
The Bayesian perspective offers a unique framework to tackle this problem (see, Chapter 5,
particularly, Section 5.5). For each IPOI, there exist limited data, yet often strong prior beliefs
regarding its performative and ostensive aspects (Feldman and Pentland, 2003 [54]) – a setting
that can best be served by the Bayesian approach. Such prior views typically emerge by
examining the performances of the respective comparables, and inspired by this insight, this
thesis approximates the prior joint density of a given IPOI (e.g., IPOI100 in Figure 6-10) by the
posterior joint density of its closest comparable. The estimated prior joint density was the only
missing piece in the puzzle of PDF90IPOI100, and when it is supplied, one can proceed with
deriving VaR inferences. Once IPOI100 itself hits the market and accumulates sufficient
history, its prior joint density is updated by Equation (6.43), using its 90-day price trajectory.
The obtained posterior joint density is then recorded to serve as the prior joint density of an
ensuing IPOI whose closest comparable is selected to be IPOI100 (see, Section 6.3.2.1).
The next challenge of the research appears in a case where the closest comparable of an
IPOI (e.g., IPOI1) resides in the category of ‘pioneers’, highlighted in green in Figure 6-10.
None of the pioneers is scrutinized under Phase I (see, Section 6.2.1). Consequently, the
selected pioneer would not be associated with any ‘closest comparable’ whose posterior joint
density could play the role of the prior joint density for the selected pioneer. Evidently, when
there is no prior joint density associated with the selected pioneer, no posterior density can be
found, to be adopted as the prior density of IPOI1. To address this challenge, this research
draws on the notion of ‘non-informative’ priors (see, Section 5.3, Page 95) and attributes a
‘uniform’ density to all the pioneers to serve as their prior joint density. This uniform density
is individually updated for each pioneer, using the respective 90-day price trajectory, and the
yielded posteriors would be reserved for cases like IPOI1. The next section expounds how the
range of each dimension of the uniform joint density (i.e., Μ, Σ, Ξ, and Ω) is set.
8.2.2. Uniform Prior Joint Density
Aside from possible ranges for the random variables Μ, Σ, Ξ, and Ω, little is known about them. The Bayesian analysis therefore refrains from expressing a preference for any specific combination of values residing within these intervals. This ignorance can be reflected by defining the prior joint density as a uniform density (see, Section 5.3) over the four-dimensional volume spanned by the ranges of Μ, Σ, Ξ, and Ω; that is, RΜ, RΣ, RΞ, and RΩ, respectively. The prior joint density in the case of pioneers can then be expressed mathematically as shown below:
f^prior_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω) = [I(μ ∈ RΜ)/RΜ] × [I(σ ∈ RΣ)/RΣ] × [I(ξ ∈ RΞ)/RΞ] × [I(ω ∈ RΩ)/RΩ]    (8.1)
In Equation (8.1), I(.) stands for an indicator function which equals 1 when its argument is true and 0 otherwise, and each range symbol in the denominators denotes the length of the respective interval. The next two subsections (i.e., Sections 8.2.2.1 and 8.2.2.2) are devoted to explaining how this research estimates RΜ, RΣ, RΞ, and RΩ.
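For illustration, Equation (8.1) transcribes directly into code. The sketch below uses the ranges reported at the end of Section 8.2.2.2; it is a sketch of the formula, not the thesis implementation.

```python
# Uniform prior joint density of Equation (8.1) over the box
# R_MU x R_SIGMA x R_XI x R_OMEGA (ranges from Section 8.2.2.2).
R_MU    = (-0.0196, 0.0211)
R_SIGMA = ( 0.0175, 0.0848)
R_XI    = ( 0.0419, 0.1310)
R_OMEGA = ( 0.1325, 0.1969)

def uniform_prior(mu, sigma, xi, omega):
    """Product of four independent uniform densities; zero outside the box."""
    dens = 1.0
    for value, (lo, hi) in zip((mu, sigma, xi, omega),
                               (R_MU, R_SIGMA, R_XI, R_OMEGA)):
        if not (lo <= value <= hi):
            return 0.0            # the indicator I(.) in Equation (8.1)
        dens /= (hi - lo)         # divide by the interval length
    return dens

print(uniform_prior(0.0, 0.05, 0.08, 0.15) > 0)   # True: inside the box
print(uniform_prior(0.5, 0.05, 0.08, 0.15))       # 0.0: outside the box
```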
8.2.2.1. Estimating RΜ and RΣ
The ranges of the expected daily rate of return (i.e., RΜ) and the daily volatility (i.e., RΣ) are estimated according to the steps described below:

Step 1. For each pioneer, the prices are inflation-adjusted over the respective 90-day period, choosing day 90 as the base day. Inflation is an economic variable that is perceived to impact stock returns. Since the posterior joint density of a pioneer – which eventually emerges from these returns – can potentially be adopted by a succeeding IPOI as its prior joint density, correcting for inflation facilitates such cross-time transmissions. Drawing on the same logic, prior to any analysis, the realized 90-day sequence of each IPOI is likewise adjusted for inflation on the basis of the corresponding v90. The inflation adjustments are conducted using the GDP deflator data.
Step 2. For each pioneer, the random behaviour of price is modeled by a GBM whose analytic solution is presented in Chapter 6 (see, Equation (6.40)). In individual cases, the parameters μ and σ are calibrated relying on the corresponding 90-day price trajectory and using the method of maximum likelihood estimation (MLE), whose key idea is to find a pair (μ̂, σ̂) that would produce the best fit to the observed historical returns. The details of the calibration process can be found below in Section 8.2.2.1.1.
Step 3. For each pioneer, the 95% confidence intervals of the estimates of μ and σ are computed; these reflect the degree of certainty associated with the rule that yields the confidence intervals (see, Section 8.2.2.1.1).
Step 4. Executing Steps 1 to 3 for all the 51 pioneers would produce 51 pairs (μ̂_j, σ̂_j), 51 confidence intervals for μ (i.e., μ̂_{L,j} ≤ μ_j ≤ μ̂_{U,j}), and 51 confidence intervals for σ (i.e., σ̂_{L,j} ≤ σ_j ≤ σ̂_{U,j}), where j = 1, …, 51 designates the respective pioneer and μ̂_{L,j} and μ̂_{U,j} denote the lower and upper bounds of the confidence interval for μ_j. Similarly, σ̂_{L,j} and σ̂_{U,j} represent the boundaries of the confidence interval for σ_j.
Step 5. In order to ensure that RΜ and RΣ are sufficiently broad ranges which encompass all the typical values that can be attributed to the expected daily rate of return and the daily volatility, the unions of the corresponding sets of confidence intervals are computed to serve as RΜ and RΣ; that is, RΜ encloses the union of all the 51 intervals [μ̂_{L,j}, μ̂_{U,j}], and RΣ is set to be the collection of all the values residing in the 51 intervals [σ̂_{L,j}, σ̂_{U,j}]. The process is visualized in Figure 8-5.
Figure 8-5. This figure visualizes the process of determining RΜ and RΣ.
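The union step (Step 5) amounts to pooling the pioneers' intervals. A minimal sketch with three made-up confidence intervals follows; the thesis uses 51, derived from the pioneers' 90-day price paths.

```python
# Illustrative 95% confidence intervals for mu from three hypothetical
# pioneers, each as (lower, upper); the thesis computes 51 such intervals.
intervals = [(-0.012, 0.008), (-0.005, 0.015), (-0.019, 0.002)]

# R_MU is taken as the range enclosing the union of all the intervals,
# i.e. from the smallest lower bound to the largest upper bound.
r_mu = (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))
print(r_mu)   # (-0.019, 0.015)
```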
Key Consideration 1.
Since calibration in each case is conducted based on a 90-day price trajectory, the confidence intervals may be wide, reflecting the uncertainty around the respective μ and σ estimates. Particularly pronounced is the impact of sample size on the width of the confidence interval of μ (see, Equation (8.7)); that is to say, generally, only a long history would yield a reliable estimate of μ, and indeed, a wider, rather than tighter, confidence interval is logically more coherent with the interpretation of the expected rate of return: if one could estimate μ with a high level of precision and accuracy, then one could confidently speculate on the direction and magnitude of the drift, which is, by nature, an idealistic scenario. Nonetheless, this research is not concerned with achieving a precise single estimate for μ or σ in the case of every pioneer; rather, it seeks a sufficiently wide, yet focused, global interval that encompasses all the typical values of μ or σ for all pioneers. In other words, this research endeavours to obtain the narrowest interval such that any value residing beyond its bounds can safely be claimed to carry a negligible probability. This goal would adequately be met even if the respective 95% confidence interval of each pioneer were used as its specific RΜ or RΣ, which would eventually lead to pioneer-specific uniform prior joint densities. Yet, this study errs on the conservative side by adopting the union of all such intervals to serve as the global estimates of RΜ and RΣ.
It is worth remarking that the foremost prior joint density is chosen to be uniform to reflect prior ignorance. For each pioneer, this prior joint density is subsequently updated by the respective ‘likelihoods’, leading to a posterior density more concentrated than the preliminary prior. A side analysis indicates that in the case of each pioneer, after sequentially updating the uniform density (90 − 1) times, the mass of the probability density retreats from the boundaries toward the center, supporting the initial claim that beyond RΜ and RΣ, the probabilities are negligible.
In further support of the proposed path for estimating RΜ and RΣ, it must be stated that the foremost prior joint density utilized here does not express complete ignorance; this research has adopted a comprehensive approach to ensure that the narrowest conservative intervals are approximated for RΜ and RΣ. The Bayesian framework indeed allows for less complex approaches, without trading much accuracy for simplicity. One extension of such views to this research would imply setting RΜ and RΣ such that they simply represent dominant beliefs regarding the inflation-adjusted μ and σ in the ‘health care’ sector. Below, one such example is presented, adopted from Rachev et al. (2008) [115]. The approach appears inferior to the one suggested by this thesis, yet it is prevalently used.
Suppose that the random behaviour of the monthly return of an asset is best described by the normal N(r, s²) and the returns are identically and independently distributed. Further, assume that the variance s² is given, whereas the mean r is not deterministic and itself represents a realization from a probability distribution whose prior form is conjectured to be N(a, d²). The problem, therefore, boils down to finding a and d, which would subsequently determine the dynamics of r. In this case, the authors (Rachev et al., 2008) [115] estimated the two parameters a and d by solely relying on their perspective of the market. As an illustration, if the median average monthly return is viewed to be typically 1%, this would imply that a is equal to 1%. Furthermore, the parameter d can be specified based on a belief such as ‘the probability of observing an average monthly return less than 0.5% is not more than 25%’. Such a speculation can be translated into the statement that the first quartile equals 0.5%, upon which d can be calculated.
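The quartile belief translates into d via the 25% quantile of the standard normal. A sketch of the arithmetic, using the 1% and 0.5% figures of the example above:

```python
from statistics import NormalDist

# Belief 1: the median average monthly return is 1%        ->  a = 0.01.
# Belief 2: P(average monthly return < 0.5%) = 25%         ->  first quartile at 0.005.
a = 0.01
z25 = NormalDist().inv_cdf(0.25)    # ~ -0.6745, the 25% standard normal quantile
d = (0.005 - a) / z25               # quartile = a + z25 * d, solved for d

print(round(d, 5))   # 0.00741
```

With a = 1% and d ≈ 0.74%, the prior N(a, d²) indeed places its first quartile at 0.5%.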
Stated concisely, this research does not seek to find a single strictly correct estimate for μ or σ, nor is it concerned with deriving a long-term approximation of μ or σ. On the contrary, it aims at acquiring two intervals which reflect the prior evidence on the ranges of typical values for the short-term expected daily rate of return and the daily volatility. In such a context, one expects to achieve reliable outcomes even if these intervals are simply set based on arbitrary facts and anecdotes or dominant views in the industry – an anticipation that emerges from the capacity of the Bayesian methodology to sequentially update and refine the prior knowledge and filter out irrelevant prior information. Yet, in a more prudent approach, this thesis utilizes the 95% confidence intervals and takes the conservatism one step further by determining RΜ and RΣ as the unions of all the respective confidence intervals.
8.2.2.1.1. Maximum Likelihood Estimation Method
For the sake of thoroughness, the method of maximum likelihood estimation employed to approximate μ and σ is outlined below. The reader is encouraged to consult statistics texts for a detailed review. Assume that the 90-day price trajectory of IPO_i is represented as follows: (v_{i,1}, v_{i,2}, v_{i,3}, …, v_{i,90}). This 90-tuple is indeed a realization from the random sequence (V_{i,1}, V_{i,2}, V_{i,3}, …, V_{i,90}). The maximum likelihood estimates of μ and σ are defined as the values which maximize the probability of observing the vicinity of the array (v_{i,1}, v_{i,2}, v_{i,3}, …, v_{i,90}).
As can be deduced from the analytic solution of the stochastic differential equation of the GBM (see, Equation (6.40)), the logarithmic period returns (i.e., ln(V_{i,2}/V_{i,1}), ln(V_{i,3}/V_{i,2}), …, ln(V_{i,90}/V_{i,89})) are identically and independently distributed, and their dynamics can be described by the normal N(r, s²) = N((μ − σ²/2)Δt, σ²Δt). By virtue of the i.i.d. assumption for this sequence, the joint distribution of the logarithmic returns for IPO_i can be expressed by the following equation:

f_{X_{i,1}, X_{i,2}, …, X_{i,89}}(x_{i,1}, x_{i,2}, …, x_{i,89} | r, s) = ∏_{j=1}^{89} f_{X_{i,j}}(x_{i,j} | r, s)    (8.2)
where the random variable X stands for the logarithmic return, and x denotes a realization from X.

When an observed sequence (x_{i,1}, x_{i,2}, …, x_{i,89}) is substituted into the joint density function of IPO_i, shown by Equation (8.2), the only remaining unknowns are r and s (and, consequently, μ and σ). The resultant function is referred to as the likelihood function and can be maximized in
r and s to obtain the maximum likelihood estimates, denoted by r̂ and ŝ, respectively. Note that the product shown in the preceding equation can become extremely small. Thus, to avoid numerical problems, one maximizes the log-likelihood function instead. In the case of the GBM, the normality assumption yields a closed form for f_{X_1, X_2, …, X_89}(x_1, x_2, …, x_89 | r, s) and leads to the following familiar expressions for r̂ and ŝ:
r̂ = (1/n) ∑_{j=1}^{n} x_j,  with n = 89    (8.3)

ŝ² = (1/(n − 1)) ∑_{j=1}^{n} (x_j − r̂)²    (8.4)

Subsequently, μ and σ can be estimated as shown below:

r̂ = (μ̂ − σ̂²/2) Δt    (8.5)

ŝ² = σ̂² Δt    (8.6)
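Equations (8.3) to (8.6) translate directly into code. The sketch below calibrates (μ̂, σ̂) from a price path with a daily step Δt = 1; the deterministic toy path is illustrative only, chosen so the expected output is known exactly.

```python
import math

def calibrate_gbm(prices, dt=1.0):
    """MLE of (mu_hat, sigma_hat) of a GBM from logarithmic period returns."""
    x = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(x)                                              # 89 for a 90-day path
    r_hat = sum(x) / n                                      # Equation (8.3)
    s2_hat = sum((xj - r_hat) ** 2 for xj in x) / (n - 1)   # Equation (8.4)
    sigma_hat = math.sqrt(s2_hat / dt)                      # Equation (8.6)
    mu_hat = r_hat / dt + sigma_hat ** 2 / 2                # Equation (8.5)
    return mu_hat, sigma_hat

# Toy path with a constant 1% daily log return: sigma_hat must vanish
# and mu_hat must then reduce to the drift itself.
prices = [100 * math.exp(0.01 * t) for t in range(90)]
mu_hat, sigma_hat = calibrate_gbm(prices)
print(round(mu_hat, 6), round(sigma_hat, 6))   # 0.01 0.0
```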
Suppose Q_n represents the average of n i.i.d. normal random variables; i.e., Q_n = (X_1 + ⋯ + X_n)/n. Since the average itself is normally distributed (Q_n ~ N(r, s²/n)), the (1 − Ƿ) × 100% confidence interval of r can be obtained using the following equation:
q_n − CDF⁻¹_{SN,Ƿ/2} · (ŝ/√n) ≤ r ≤ q_n + CDF⁻¹_{SN,Ƿ/2} · (ŝ/√n)    (8.7)
where q_n represents an observation from the random variable Q_n and can indeed be approximated by the maximum likelihood estimate of r. The quantity CDF⁻¹_{SN,Ƿ/2} denotes the absolute value of the inverse cumulative distribution function (CDF⁻¹) of the standard normal distribution (SN) at the cumulative probability of Ƿ/2. It equals 1.96 if Ƿ = 0.05.
The confidence interval of s² can be estimated as shown below:

(n − 1)ŝ² / CDF⁻¹_{χ²_{n−1}, 1−Ƿ/2} ≤ s² ≤ (n − 1)ŝ² / CDF⁻¹_{χ²_{n−1}, Ƿ/2}    (8.8)
In this equation, CDF⁻¹_{χ²_{n−1}, Ƿ/2} and CDF⁻¹_{χ²_{n−1}, 1−Ƿ/2} respectively represent the lower and upper Ƿ/2 × 100% quantiles of the chi-square distribution with (n − 1) degrees of freedom. Equation (8.8) builds upon the principles that the square of a standard normal random variable follows a chi-square distribution and that the sum of independent chi-square random variables is itself chi-square, with the degrees of freedom adding up; one degree of freedom is lost because r is estimated from the same sample, which yields the (n − 1) degrees of freedom above.
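Both intervals are routine to compute with a statistics library. A sketch using SciPy follows; the values of n, r̂, and ŝ are illustrative, not results from the thesis data.

```python
import math
from scipy import stats

n, r_hat, s_hat, p = 89, 0.0012, 0.025, 0.05

# Equation (8.7): normal-based interval for r.
z = abs(stats.norm.ppf(p / 2))                 # ~1.96 for p = 0.05
half = z * s_hat / math.sqrt(n)
r_ci = (r_hat - half, r_hat + half)

# Equation (8.8): chi-square-based interval for s^2.
s2_ci = ((n - 1) * s_hat ** 2 / stats.chi2.ppf(1 - p / 2, df=n - 1),
         (n - 1) * s_hat ** 2 / stats.chi2.ppf(p / 2, df=n - 1))

print(r_ci)
print(s2_ci)
```

Note how the r interval dwarfs the point estimate even with n = 89, in line with Key Consideration 1 above.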
8.2.2.2. Estimating RΞ and RΩ
It was established in Section 6.3.2 that the inflation-adjusted ratio of ‘closing price of the
first day’ to ‘IPO offer price’ can be assumed to follow a log-normal distribution. This
conclusion was reached through scrutinizing the ratios of all the 327 IPOs from the ‘health
care’ sector. Drawing on this knowledge, RΞ and RΩ can be determined by fitting a lognormal
distribution to the ratios of the pioneers and computing the 95% confidence intervals of ω and
ξ; that is, RΩ: $\hat{\omega}_L \le \omega \le \hat{\omega}_U$ and RΞ: $\hat{\xi}_L \le \xi \le \hat{\xi}_U$, where $\hat{\omega}_L$ and $\hat{\omega}_U$ denote the lower and upper bounds of the confidence interval of ω, respectively. Likewise, $\hat{\xi}_L$ and $\hat{\xi}_U$ mark the lower and upper boundaries of the confidence interval of ξ.
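Since the fitting is performed on the log-transformed ratios, the confidence intervals for ξ and ω follow the same normal-theory constructions as Equations (8.7) and (8.8). The sketch below assumes ξ and ω denote the mean and standard deviation of the log-ratios, which is one common lognormal parameterization:

```python
import numpy as np
from scipy import stats

def lognormal_parameter_cis(ratios, p=0.05):
    """Fit a lognormal to first-day-close / offer-price ratios and
    return (1 - p) confidence intervals for xi (taken here as the mean
    of the log-ratios) and omega (their standard deviation); this
    parameterization is an assumption."""
    y = np.log(np.asarray(ratios, dtype=float))
    n = y.size
    xi_hat, s = y.mean(), y.std(ddof=1)
    z = stats.norm.ppf(1 - p / 2)
    xi_ci = (xi_hat - z * s / np.sqrt(n), xi_hat + z * s / np.sqrt(n))
    omega_ci = (np.sqrt((n - 1) * s**2 / stats.chi2.ppf(1 - p / 2, n - 1)),
                np.sqrt((n - 1) * s**2 / stats.chi2.ppf(p / 2, n - 1)))
    return xi_ci, omega_ci
```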
Finally, the approaches outlined in this section yield the following estimates for RΜ, RΣ, RΞ, and RΩ:
RΜ: [−0.0196, 0.0211]
RΣ: [0.0175, 0.0848]
RΞ: [0.0419, 0.1310]
RΩ: [0.1325, 0.1969]
As detailed above, the objective was to find the narrowest intervals beyond which reside only the least probable values; that is, any combination of (μ, σ, ξ, ω) that falls outside the domain bounded by RΜ, RΣ, RΞ, and RΩ can reliably be assigned a probability of zero. It is
worthwhile to reiterate the insight that for a given pioneer, the uniform joint density defined
by RΜ, RΣ, RΞ, and RΩ (see, Equation (8.1)) is subsequently updated in light of the
corresponding observed data. Likewise, it gets more refined and becomes more concentrated
as it further propagates in time, across the succeeding IPOIs. The new evidence from each
relevant future IPOI can be regarded as a numerical sieve which revises the existent
probabilities and further pulls the mass of the joint density from the boundaries of RΜ, RΣ, RΞ,
and RΩ, and toward the center.
Finally, as a guide for the practical aspects of the procedure, it should be indicated that each of the intervals shown above is divided into 100 equal segments. Consequently, $10^8$ computationally burdensome, unique combinations of (μ, σ, ξ, ω) are obtained, which can be utilized to execute Equations (6.42) and (6.43) for individual IPOs. Below, two more pieces of
information are provided on the nuts and bolts of the process of estimating PDF90IPOI for each
IPOI. These ranges are deemed suitable and adequately conservative to cater to potential
extreme scenarios and address the tails of the distribution PDF90IPOI:
$$0.1 \le V_1/V_{\mathrm{offer}} \le 10$$

$$0.0001 \le V_{90}/V_{\mathrm{offer}} \le 10$$

The resolution of $V_1/V_{\mathrm{offer}}$ is set as 1/100 of its range, and the interval of $V_{90}/V_{\mathrm{offer}}$ is segmented into 600 parts which are of equal length on the logarithmic scale.
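Under these choices, the parameter and price grids can be laid out as in the sketch below. The use of segment midpoints as representative values is an assumption; the text only specifies the segment counts:

```python
import numpy as np

def segment_midpoints(lo, hi, n=100):
    """Split [lo, hi] into n equal segments and return the midpoint of
    each segment (midpoints are an illustrative assumption)."""
    edges = np.linspace(lo, hi, n + 1)
    return (edges[:-1] + edges[1:]) / 2

# One value per segment for each parameter interval -> 100**4 = 10**8 combinations.
mu_grid = segment_midpoints(-0.0196, 0.0211)
sigma_grid = segment_midpoints(0.0175, 0.0848)
xi_grid = segment_midpoints(0.0419, 0.1310)
omega_grid = segment_midpoints(0.1325, 0.1969)
n_combinations = mu_grid.size * sigma_grid.size * xi_grid.size * omega_grid.size

# Price grids: V1/Voffer at a resolution of range/100, and V90/Voffer on
# 600 segments of equal length on the logarithmic scale.
v1_grid = segment_midpoints(0.1, 10, n=100)
v90_grid = np.logspace(np.log10(0.0001), np.log10(10), 601)
```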
Key Consideration 2.
It can be inferred from Section 8.2.2 that the ‘empirical’ expected rate of return (i.e., μ) will be employed in estimating PDF90IPOI, and hence, in deriving VaR inferences. As a side note,
it is worthwhile to give insight as to why VaR estimations are always developed in the
‘real-world’, as opposed to the ‘risk-neutral’, mode. To delineate, one must first pinpoint
the underlying principles of the risk-neutral approach which, in essence, emerged as a
means for pricing. It is indeed a forward-looking concept drawing on the notion of ‘no-
arbitrage’ to determine a price which ensures that no ex-ante arbitrage profit can be locked
in by any of the investing parties.
As a simple case in point, consider an investment in a European call option in a setting
with no transaction costs and dividends. The risk-neutral concept relies upon the notion of
creating a ‘riskless’ portfolio which would earn a return equal to the risk-free interest rate
in the absence of arbitrage opportunities. In the case of pricing a call option on a stock,
such a riskless position can be established by holding a short position on the option and a
long position in Δ shares of the same stock. The number of shares, Δ, is set such that there
is no uncertainty regarding the value of this portfolio at the end of an infinitesimal period;
hence, its return should correspond to the risk-free rate. In this setting, one can solve for the
only remaining unknown; that is, the price of the option at the beginning of the period.
Delta neutrality or hedge of the position is only temporary and can be preserved by
‘rebalancing’ the portfolio in sufficient frequency through dynamically adjusting the
number of the stocks (i.e., Δ).
The risk-neutral concept indeed offers a framework where pricing can be carried out
with the assumption that the investors are risk-neutral and as long as the riskless position is
maintained, the ‘empirical’ expected rate of return, or the individuals’ risk preferences, are
irrelevant. From a practical perspective, pricing in a setting where both parties are risk-
neutral is only natural, since the outcome would reconcile the incentives of both agents and remain compatible with the incentive structure of the game of market exchange.
Contrary to the objective pursued by the risk-neutral wisdom – which is setting up
instantaneously hedged, and hence riskless, portfolios for the sole sake of pricing – VaR
focuses on estimating the risk in an event where a position remains unhedged. It is indeed
the very characteristic of risk-neutrality that renders VaR computations irrelevant in the
risk-neutral world. Ultimately, it is a no-arbitrage world, where investors are concerned
with constant hedging and realizing risk-free returns. They are, therefore, certain about
their risk and do not seek, nor do they need, a measure like VaR to gauge the riskiness of
their positions.
Outside the risk-neutral context, however, lies the ‘real-world’, where not all the
positions are hedged and investors trade increased risks for increased rewards. It is a world
where an investor may wish to hold (lock) a risky position, with no or partial protection, in
an attempt to potentially achieve a higher profit. Within such a setting, individual risk
preferences, and empirical expected rate of return (i.e., 𝜇𝜇), play a central role and investors
can benefit from a concept like VaR to gauge their exposure. In a nutshell, it can be stated
that regardless of the pricing techniques utilized to arrive at fair price estimates, the gain or
loss of a given portfolio, at a predetermined time horizon, is ultimately a function of the
empirical expected rate of return. Consequently, it can be concluded that the respective
gain-loss distribution as well as VaR inferences are contingent on ‘real’, not ‘risk-neutral’,
probabilities. The reader is encouraged to consult the following literature for more
explanatory comments on why a risk-neutral mode would not offer a suitable framework
for risk management: Bliss and Panigirtzoglou (2004) [28], Vincent-Humphreys and Noss
(2012) [138], Hull (2012) [75], Giordano and Siciliano (2013) [60].
8.2.3. Calibrating the Distance Function
Previous discussions have made it plain that one requisite for estimating a given IPOI’s
predictive density (i.e., PDF90IPOI) is its closest comparable’s posterior joint density of the
parameters. Since each IPOI is associated with more than one comparable, a metric is required
to measure the similarity between the IPOI and each of its comparables. Equation (6.37),
suggested in Section 6.3, serves this goal and assists with pinpointing the best match for the
given IPOI. Below, it is repeated for the ease of reference:
$$D_{cb}^2 = \left(\lambda_c - \lambda_b\right)^2 + \alpha\left(\psi_c - \psi_b\right)^2, \qquad c = 1, \dots, \eta \qquad (6.37)$$
Recall that in this equation, η represents the total number of comparables associated with an IPOI, marked by the ID of b, and the subscript c designates the ID of the comparable under scrutiny. The closest comparable is selected as the one which leads to the minimum $D_{cb}$.
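The selection rule of Equation (6.37) amounts to a straightforward argmin. The sketch below treats λ as possibly vector-valued (each firm is represented by several λ-coordinates); the function interface is an illustrative assumption:

```python
import numpy as np

def closest_comparable(lam_b, psi_b, lam_c, psi_c, alpha=1.5):
    """Equation (6.37): squared distances between IPOI b and each of its
    eta comparables, and the index of the minimizing comparable.

    lam_b: lambda-coordinates of the IPOI, shape (k,)
    lam_c: lambda-coordinates of the comparables, shape (eta, k)
    psi_b, psi_c: efficiency scores (scalar, and shape (eta,))
    """
    lam_b = np.atleast_1d(lam_b)
    lam_c = np.atleast_2d(lam_c)
    d2 = ((lam_c - lam_b) ** 2).sum(axis=1) \
         + alpha * (np.asarray(psi_c) - psi_b) ** 2
    return int(np.argmin(d2)), d2
```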
Phase I computes the values of λ and ψ for a given IPOI and its comparables. Thus, the only unknown of Equation (6.37) remains α, which is estimated using a maximum likelihood approach (see, Section 6.4). The algorithm of the method is visualized by the flowchart presented in Figure 6-11. Concisely, the calibration process seeks an estimate of α which would maximize the probability of the realized sequence of V90's: $P\left(V_{90}^{\mathrm{IPOI}_1} \approx v_{90}^{\mathrm{IPOI}_1},\, V_{90}^{\mathrm{IPOI}_2} \approx v_{90}^{\mathrm{IPOI}_2},\, \dots,\, V_{90}^{\mathrm{IPOI}_{276}} \approx v_{90}^{\mathrm{IPOI}_{276}} \,\middle|\, \alpha\right)$. To further make sense of the calibration process, consider
Figure 8-6. This figure exhibits the PDF90IPOI of the last IPOI (i.e., the 276th IPOI) which is generated by temporarily assigning the value of 1.5 to α; that is, the closest comparable of this IPOI is identified based on the assumption that α = 1.5 in Equation (6.37), and subsequently, the corresponding PDF90IPOI is estimated by executing Phase II. It is noteworthy that the area under the curve of any estimated PDF90IPOI should be computed to examine whether it equals
unity; otherwise, the estimated PDF90IPOI must be normalized. In this research, every estimated
PDF90IPOI met the condition of unit area under the curve, and hence, no normalization was
carried out.
Figure 8-6. The graph visualizes the estimated PDF90IPOI for the 276th IPOI (i.e., the 327th IPO). The dashed line marks the observed price on day 90 (v90realized) normalized by the respective inflation-adjusted offer price (voffer). Its position is indicative of the probability that would be associated with the event v90realized/voffer ≈ 1.015 if the proposed methodology of this thesis is employed and if α is set equal to 1.5. A similar process is undertaken for each IPOI in order to obtain the value of the yielded PDF90IPOI at the respective v90realized/voffer. These values are recorded to be subsequently utilized in determining the likelihood of α = 1.5, following the steps described within the text.
Next, in a backtesting approach, the probability of observing the vicinity of the realized V90 (i.e., $v_{90}^{\mathrm{realized}} < V_{90} \le v_{90}^{\mathrm{realized}} + \mathrm{d}v$) is recorded; that is, PDF90IPOI × dv. To avoid numerical complications, the value of the probability density function (PDF90IPOI) at v90realized is stored, as opposed to a small probability pertaining to the event $v_{90}^{\mathrm{realized}} < V_{90} \le v_{90}^{\mathrm{realized}} + \mathrm{d}v$. Since dv is defined as an infinitesimally small change in price which remains consistent across different assumptions of α, it would be cancelled out if it were maintained in the process.
The aforementioned steps are iterated for individual IPOIs, and in a like fashion, for each IPOI, the value of its PDF90IPOI at the respective realized V90 is retrieved. The iteration process returns an array of length 276 for a given α, which comprises the PDF values at the corresponding v90realized's. The final step involves calculating the logarithm of the product of
all the elements in this array. As is readily perceived, the logarithm function is used to avoid
obtaining an excessively small decimal which is inevitable due to the dimension of the
problem; the estimated product indeed represents the value of a joint PDF – which spans the
ranges of 276 variables in a (276+1)-dimensional space – at a given point whose position is
uniquely determined by 276 coordinates. The reader is encouraged to review Equations (6.46)
and (6.47) for the mathematical expression of the approach.
With each update on the value of α, a new likelihood would be yielded by the above-indicated process. In other words, the probability of observing the sample sequence of 276 v90realized's varies under different assumptions of α. Figure 8-7 represents the outcomes of the calibration process. The likelihood function reaches its maximum at α ≅ 1.5, and hence, the maximum likelihood estimate of α is equal to 1.5.
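The calibration loop can be sketched as below. The function `pdf_at_realized` is a hypothetical stand-in for the full Phase I/Phase II pipeline, returning the value of the estimated PDF90IPOI at the realized V90 for a given IPOI under a candidate α; base-10 logarithms are used here, matching the likelihood ratios reported later in this section:

```python
import numpy as np

def log_likelihood_of_alpha(alpha, pdf_at_realized, n_ipoi=276):
    """Log (base 10) of the product of the PDF values at the realized
    prices; the common dv term cancels across alphas and is omitted."""
    vals = np.array([pdf_at_realized(j, alpha) for j in range(1, n_ipoi + 1)])
    return np.log10(vals).sum()

def calibrate_alpha(pdf_at_realized,
                    alpha_grid=(0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5)):
    """Return the maximum likelihood estimate of alpha over the grid,
    together with the log-likelihood at each candidate."""
    lls = [log_likelihood_of_alpha(a, pdf_at_realized) for a in alpha_grid]
    return alpha_grid[int(np.argmax(lls))], lls
```

A toy `pdf_at_realized` whose densities peak at α = 1.5 recovers that value, illustrating the mechanics without the 20-hour Phase II runs.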
The preceding discussions provided insight that Phase II is a computationally intensive
process. After exhaustively optimizing the algorithm to attain a satisfactory speed, it takes
approximately 20 hours to obtain each point depicted on Figure 8-7, using a machine with 16.0
GB of RAM and the processor of Intel(R) Core(TM) i7-3770 CPU @ 3.4 GHz. In the interest
of efficient execution of the calibration process, it was first implemented using the following 9
values of α: [0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5]. Then, in an ‘adaptive search’ manner, the resolution
was increased within the ranges deemed the most susceptible to change in order to further explore the impact of changing α.
Note that starting at the vicinity of α = 3, the log-likelihood function stabilizes, which can be explained as follows: for a given IPOI, beyond a certain threshold, the α term (i.e., the second term) in Equation (6.37) would dominate, effectively identifying the closest comparable as the one whose efficiency score (ψ) falls the closest to the given IPOI's ψ.
Figure 8-7. This graph exhibits the log-likelihood function of α. Each depicted point is obtained by executing the following two main steps: Step 1 assigns a value to α and executes Phase I and Phase II for each of the 276 IPOIs, which results in 276 individual PDF90IPOI's under the postulated α. Step 2 computes the logarithm of the product of 276 elements, each representing the value of the PDF90IPOI of a given IPOI at the respective realized V90. The likelihood function hits its maximum at α ≅ 1.5.
Therefore, beyond that threshold, the closest comparable for the specified IPOI would remain intact. Now, suppose that LISTα is a vector of length 276 with each cell populated by the ID of the closest comparable of the corresponding IPOI. It can be deduced from Figure 8-7 that once α hits the value of approximately 3, the α term dominates the similarity metric in all the 276 IPOI cases and LISTα remains unchanged from then onward. If the array of closest comparables (i.e., LISTα) stays fixed, so does the list of posterior joint densities of the parameters required to run Phase II. Recall that in estimating the PDF90IPOI for a given IPOI, its closest comparable's posterior joint density is adopted as the prior joint density of the IPOI. Hence, if the posterior density does not vary, the ensuing PDF90IPOI would likewise remain invariant, leading to a stable log-likelihood quantity.
To further scrutinize Figure 8-7, it is worth directing attention to the small variations between the values depicted on the vertical axis. Since these numbers represent log-likelihoods, their differences only appear insignificant; the likelihood ratios derived from them are, in fact, substantial. To elaborate, consider the differences between the values of the log-likelihood function at α = 0, α = 1.5, and α = 2.5. Refer to the corresponding likelihoods as A, B, and C, respectively. The following subtractions then result from Figure 8-7:
$$\log B - \log A = 0.4122 \;\Rightarrow\; \frac{B}{A} = 10^{0.4122} = 2.5832$$

$$\log B - \log C = 0.8923 \;\Rightarrow\; \frac{B}{C} = 10^{0.8923} = 7.8037$$
The former subtraction indicates that the maximum likelihood is 2.6 times greater than the
likelihood at 𝛼𝛼 = 0, where the second term in Equation (6.37) indeed does not exist.
According to the latter subtraction, the maximum likelihood is approximately 8 times the
minimum likelihood. Both examples reflect that the presence of the second term (i.e., the ψ
term), in addition to the λ term, affords higher accuracy and precision in identifying the closest comparables, estimating PDF90IPOI's, and consequently, quantifying risk.
The descending trend of the log-likelihood function after α ≅ 1.5 is another observation worth accentuating. In the multi-dimensional space of operating and financial characteristics, each firm can be represented by one point. Phase I maps such a point onto a new space where it is represented with the new coordinates of λs and ψ. The similarity analysis is conducted in this new space. As α grows, more emphasis is placed on the ψ term of the similarity metric, diminishing the impact of the λ term which incorporates the other dimensions of the comparison. The process of comparable selection is, therefore, adversely affected by the suppression of the role of λ, leading to less accurate estimated PDF90IPOI's and, consequently, lower log-likelihoods.
To complete the discussion on Figure 8-7, the reader is provided a window into its microstructure and the implication of the maximum likelihood estimate of α in terms of the accuracy of individual PDF90IPOI's. Figure 8-8 serves as an example of the typical theme that one encounters when plotting the PDF90IPOI's produced under the maximum likelihood estimate of α (i.e., α = 1.5) and the ones yielded using other assumptions of α (e.g., α = 0). As depicted, a narrower PDF90IPOI is accomplished when α = 1.5, which is more desirable since it indicates less variance and more certainty about the value of V90.
A critical eye may point out that in Figure 8-8, the realized V90 (v90realized) lands closer to the mode of the PDF90IPOI of the 276th IPOI generated with α = 1.5 and, hence, argue that the red curve is a superior forecast since v90realized comes closer to its most likely value. This logic is incompatible with the philosophy underlying the notion of the likelihood function in statistics: it does not matter how closely the realized quantity hits the mode. What matters is the probability assigned to (or predicted for) the realized value. Drawing on this principle, in a case such as Figure 8-9, the red curve would maintain its superiority although its mode is more distant from the realized V90.
As a closing remark to this section, it is worthwhile to investigate the trend of the maximum likelihood estimate of α as the number of IPOIs participating in the calibration process changes. Using all the 276 IPOIs, the above-indicated calibration intended to find a suitable approximation for α which can subsequently be presented as a property of the distance function, eliminating the burden of the extra step of calibration in measuring the risk of other IPOs. However, one may choose to calibrate the distance function each time it is engaged in estimating the predictive density of a given IPOI. To elaborate, suppose that one aims at quantifying the risk of the 100th IPO (i.e., the 49th IPOI). Drawing upon the outcomes shown above, the analysis can be carried out by assuming α is equal to 1.5 a priori.
Figure 8-8. The graph visualizes the two PDF90IPOI's estimated for the 276th (last) IPOI under the two assumptions of α = 0 and α = 1.5. A narrower PDF is achieved with the maximum likelihood estimate of α (α = 1.5), which is more desirable since it indicates less variance and more certainty about the value of V90.
Alternatively, one can find the maximum likelihood estimate of α for this specific problem. The latter approach would require repeating all the aforementioned calibration steps except that under each assumption of α, there exist only 48 PDF90IPOI's (and the respective 48 v90realized's), as opposed to 276 PDF90IPOI's (and the corresponding v90realized's).
Figure 8-9. Similar to Figure 8-8, this plot visualizes the two estimated PDF90IPOI's for the 17th IPOI when α = 0 and α = 1.5. The graph serves as an example of the case where the maximum likelihood estimate of the parameter (α = 1.5) has produced a narrower PDF; yet, the mode of the PDF is comparatively remote from the realized price, marked by the dashed line. Such an observation should not be viewed as an element of concern about the accuracy of the PDF, since the concept of maximum likelihood bases the comparison of accuracy upon the probability assigned to the realized event, and not upon the deviation of the realized event from the most likely value (mode) of the PDF.
Figure 8-10 summarizes the calibration results produced under varying sets of engaged IPOs (i.e., PDF90IPOI's). Sensibly, in the case of the first IPOI (i.e., the 52nd IPO), the distance function cannot be calibrated since there exists no preceding IPO with a predictive density. It should be noted that the set of α values, acting as the potential candidates for the maximum likelihood estimate of α, has remained intact, which implies that the vector of α in Figure 8-10 is identical to the one used for Figure 8-7. This restriction arises from the computational tedium of estimating PDF90IPOI's under a given α. Since the PDF90IPOI's have been computed and stored for all the elements of the macro set of 276 IPOIs, under the individual α values displayed in Figure 8-7, one can benefit from them to straightforwardly run the calibration process using subsets of the macro set; otherwise, for any new candidate of α, one ought to first estimate the respective PDF90IPOI's, which considerably increases the execution time.
Figure 8-10 demonstrates more fluctuations for the early subsets, where the calibrations were conducted on the basis of smaller groups of IPOs; as the number of participating IPOs increases, the value of the maximum likelihood estimate of α approaches 1.5. The rather sharp convergence can be attributed to the limited number of the α candidates available for the analysis. Note that the two α candidates in the vicinity of α = 1.5 are α = 1.25 and α = 1.7, which have evidently been dominated by α = 1.5. Perhaps a smoother convergence behaviour would be recorded if a higher resolution of α were computationally affordable.
To support the stability of the convergence behaviour depicted in Figure 8-10, this thesis draws upon a well-known theorem in estimation theory: the Cramer-Rao Lower Bound (CRLB) (Panik, 2005 [106]; Hardle and Simar, 2012 [67]). It can be used to compute the lower bound for the variance of the posited maximum likelihood ‘estimator’ of α.¹ A desirable property of any maximum likelihood estimator is that in the limit (i.e., for asymptotically large samples), it achieves the lower bound given by the Cramer-Rao theorem (Kay, 1993 [83]). In this theorem, the variance is expressed as the inverse of the expected value of the negative Hessian matrix of the log-likelihood function with respect to the corresponding parameter(s) (in this case, α).

¹ The author is grateful to Dr. Yuri Lawryshyn and his PhD student, Ali Bashiri, for bringing this theorem to her notice.
A bootstrapping process was conducted to find the expected value of the Hessian matrix. In each bootstrapping cycle, the requisite log-likelihood arrays for individual α's were generated by resampling with replacement from the corresponding original log-likelihood sets. The obtained variances, under 10², 10³, and 10⁶ cycles of bootstrapping, can sequentially be listed as follows: 0.0153, 0.0147, and 0.0150, which respectively translate into the standard deviations 0.1238, 0.1211, and 0.1224. Drawing upon the depicted consistency between the results, the standard deviation can be assumed to be approximately equal to 0.12. The reported standard deviation specifies the long-run deviation of the proposed maximum likelihood estimate of α from its true value. In other words, for an asymptotically large sample of IPOs, the expected margin of error on the maximum likelihood estimate of α is given by the CRLB.
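A minimal version of this bootstrap can be sketched as follows: the per-IPOI log-likelihood contributions on an evenly spaced α grid are resampled with replacement, the curvature of each resampled total log-likelihood at its maximum is measured by a central second difference, and the CRLB variance is the inverse of the expected negative curvature. The array layout and the finite-difference stencil are illustrative assumptions, not details taken from the thesis:

```python
import numpy as np

def bootstrap_crlb_sd(ll_contrib, alpha_grid, n_cycles=1000, seed=0):
    """Bootstrap estimate of the Cramer-Rao standard deviation of the
    maximum likelihood estimator of alpha.

    ll_contrib: (n_ipoi, n_alpha) array of per-IPOI log-likelihood
    contributions evaluated on the evenly spaced alpha_grid.
    """
    rng = np.random.default_rng(seed)
    ll_contrib = np.asarray(ll_contrib, dtype=float)
    alpha_grid = np.asarray(alpha_grid, dtype=float)
    h = alpha_grid[1] - alpha_grid[0]
    curvatures = []
    for _ in range(n_cycles):
        rows = rng.integers(0, ll_contrib.shape[0], size=ll_contrib.shape[0])
        ll = ll_contrib[rows].sum(axis=0)        # resampled total log-likelihood
        k = int(np.argmax(ll))
        k = min(max(k, 1), alpha_grid.size - 2)  # keep the stencil in range
        curvatures.append((ll[k - 1] - 2 * ll[k] + ll[k + 1]) / h**2)
    fisher_info = -np.mean(curvatures)           # expected negative curvature
    return float(np.sqrt(1.0 / fisher_info))
```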
Noticeable in Figure 8-10 is a small cluster of size 9 where the individual maximum likelihood estimates of α are reported to be 0.5. Within this cluster, individual calibrations are
carried out using the following subsets, respectively: 52nd IPO to 141st IPO, 52nd IPO to 142nd
IPO, …, 52nd IPO to 149th IPO. A more detailed investigation of the corresponding log-
likelihood functions reveals that 0.5 indeed represents the ‘argument’ of a local maximum,
which either wins the position of the global maximum with a slight margin over 1.5 (see,
Figure A-7), or holds this position jointly with 1.5 (see, Figure A-8). Thus, although it first
appeared to be an anomaly, the maximum likelihood estimate of 0.5 is indeed consistent with
the rest of the results exhibited by Figure 8-10. Appendix A further elucidates this point and
summarizes the outcomes of this side analysis. Based on the scrutiny of the underlying log-
likelihood functions, it is concluded that the local maximum at 0.5 is ‘unstable’ and will be
dominated by the one at 1.5 once the subset size rises beyond a threshold (in this problem,
148).
In substance, Figure 8-10 depicts that in the majority of the cases, the maximum likelihood estimate of α falls within the range 1.25 ≤ α ≤ 1.9, and as the pool of engaged IPOs grows in size, α = 1.5 is persistently selected as the maximum likelihood estimate of α. It is, therefore, concluded that α = 1.5 presents a reasonable approximation for the coefficient of the ψ term in the distance function and can be utilized to skip the extra step of calibration.
As a final consideration on the topic of calibration, it is worth noting that one must distinguish between the ‘relative’ and ‘standalone’ accuracy of individual sets of PDF90IPOI's, estimated using different assumptions of α. A set of PDF90IPOI's, produced under a given value of α, may collectively be more accurate compared to another set of the same size, generated using a different assumption of α; however, regardless of the relative superiority or inferiority in accuracy, both sets may be inaccurate (or accurate). The next section (Section 8.2.4) provides a comprehensive discussion on how one can examine the standalone accuracy of a set of PDF90IPOI's and reports the outcomes of the tests for the set of 276 PDF90IPOI's obtained under the maximum likelihood estimate of α (i.e., α = 1.5). Appendix A presents the outputs of these tests for the cases where α assumes a value from the following set: [0, 0.5, 1.25, 1.9, 3, 4, 5]. In concise terms, the results indicate that if one selects an α value remote from 1.5, namely 0, 0.5, 1.9, or ≥ 3, and subsequently estimates the PDF90IPOI's of all the 276 IPOIs, the resultant PDF90IPOI's would fail to pass the accuracy tests. The implication of the outcomes of the tests is two-fold: they provide evidence on the degree of the standalone accuracy of the individual sets of PDF90IPOI's produced under different assumptions of α. In addition, they signify the importance of the role of α in the proposed risk methodology, through demonstrating how changes in the value of α can influence the standalone accuracy of the corresponding sets of estimated PDF90IPOI's.
Figure 8-10. This graph exhibits the outcomes of the calibration process for varying sets of engaged IPOs. As displayed on the horizontal axis, the set producing the first point on the plot consists of the 52nd IPO only. The second point on the figure is generated using the set comprising the 52nd and 53rd IPOs. In a like fashion, the remaining points are obtained by sequentially increasing the size of the set of participating IPOs by one. For each set, the maximum likelihood estimate of α is found through repeating the maximum likelihood calibration process described in the text. As a case in point, consider the label 52-100 on the horizontal axis; it implies that the corresponding calibration process was carried out on the basis of the PDF90IPOI's estimated for the 52nd IPO to the 100th IPO, or equivalently, the 1st IPOI to the 49th IPOI, under the individual α values depicted on the vertical axis. For this particular set of length 49, the likelihood function reaches its maximum at the value of α = 1.25; that is, the maximum likelihood estimate of α is equal to 1.25. This figure demonstrates that despite early fluctuations in the value of the maximum likelihood estimate of α, the trend remains steady at α = 1.5 once the number of participating IPOs rises beyond a certain threshold (i.e., 148).
8.2.4. Scrutiny of Estimated Probability Density Functions
Scrutinizing the accuracy of the estimated PDF90IPOI's is undoubtedly the most important piece of the puzzle that this research endeavours to solve. The calibration process pursues the goal of finding a parameter estimate (i.e., α = 1.5) under which the probability of the realized prices on day 90 is maximum. Hence, its outcome can merely be interpreted in terms of the ‘relative’ accuracy of the PDFs; that is, the PDFs produced using the assumption of α = 1.5 are typically more accurate compared to the ones yielded under other values of α. However, such a statement does not shed any light on the ‘standalone’ accuracy of the PDFs generated under the assumption of α = 1.5. This section aims at tackling this aspect.
Backtesting forms the basis of any attempt to examine the proposed methodology since past
events are valuable sources of knowledge and the success of the methodology in achieving its
objectives can be judged by analyzing its performance under historical scenarios. The topic of
scrutiny is approached from three different angles which are discussed successively in separate
sections; i.e., Sections 8.2.4.1, 8.2.4.2, and 8.2.4.3.
8.2.4.1. Test of Uniform Cumulative Probabilities
The proposed methodology yields a specific PDF90IPOI for each IPOI, and only a single observation is available for each PDF90IPOI; that is, the respective realized price on day 90 (v90realized). Suppose for a moment that instead of 276 individual PDF90IPOI's and 276 v90realized's, there were a single estimated PDF, referred to as $\overline{\mathrm{PDF}}$, and 276 samples deemed to be random draws from $\overline{\mathrm{PDF}}$. One straightforward approach to verify the correctness of this claim would be gauging the deviation of the empirical PDF, built upon the 276 samples, from the hypothesized $\overline{\mathrm{PDF}}$. If the deviation is proven to be insignificant, then the original claim cannot be rejected.
This research, however, faces a different setting: here, one attempts to examine the
accuracy of a methodology (a black box, so to speak) that produces different PDFs, for each of
which only a single value would subsequently be realized. The test of uniform cumulative
probabilities is an approach that can be employed in such a case in order to examine the
validity of the methodology, and consequently its outcomes. This approach draws on a concept
referred to as ‘probability-integral transform’ which is proven below for the specific problem
at hand. For an extended elaboration, an interested reader may refer to statistics texts such as
Gibbons and Chakraborti (2005) [58]. By means of this approach, one can restructure the problem such that it resembles the simple above-indicated example where a single PDF ($\overline{\mathrm{PDF}}$) was involved.
For each IPOI, its v90realized is associated with a cumulative probability value (CPV) which is computed based on the corresponding estimated PDF90IPOI. To clarify further, CPV can be expressed as follows:

$$\mathrm{CPV}_j = P\left(V_{90} \le v_{j,90}^{\mathrm{realized}}\right), \qquad j = 1, \dots, 276 \qquad (8.9)$$

where, $j$ designates the ID of the IPOI under consideration and ranges from 1 to 276. The variable $v_{j,90}^{\mathrm{realized}}$ denotes the realized price of the $j$th IPOI on day 90.
The primary step of the test of uniform cumulative probabilities involves computing CPVs
for all the 276 IPOIs. The test indicates that if the methodology has produced
accurate PDF90IPOI’s in all the 276 cases, the empirical PDF of the computed 276 CPVs must
resemble the uniform PDF spanning the interval of [0,1]. To phrase more precisely, the
methodology would pass the accuracy test if it is proven that the observed set of 276 CPVs has
been drawn from a reference population with uniform density between 0 and 1. The following
proposition elucidates the origin of this test:
Proposition 1.
CPV is a random variable which is uniformly distributed over the interval of [0,1].
Proof.
The following lemma must first be established in order to accomplish the proof of
Proposition 1.
Lemma 1.

Suppose $X$ and $Y$ are random variables whose PDFs are denoted by $f_X(x)$ and
$f_Y(y)$, respectively. Regard $Y$ as a function of $X$, $Y = g(X)$, where $g$ is
assumed strictly increasing so that $g^{-1}$ exists. The PDF of $Y$ can,
therefore, be expressed as below:

$$f_Y(y) = \frac{f_X\big(g^{-1}(y)\big)}{g'\big(g^{-1}(y)\big)} \tag{8.10}$$

Proof.

$$y = g(x) \;\Rightarrow\; \begin{cases} x = g^{-1}(y) & (1) \\ \mathrm{d}y = g'(x)\,\mathrm{d}x & (2) \\ f_Y(y)\,\mathrm{d}y = f_X(x)\,\mathrm{d}x & (3) \end{cases}$$

From (2), $\mathrm{d}x = \mathrm{d}y/g'(x)$, and substituting (1), $\mathrm{d}x = \mathrm{d}y/g'\big(g^{-1}(y)\big)$.
Substituting (1) and (2) into (3) then yields

$$f_Y(y)\,\mathrm{d}y = \frac{f_X\big(g^{-1}(y)\big)}{g'\big(g^{-1}(y)\big)}\,\mathrm{d}y \qquad \blacksquare$$
In order to prove Proposition 1, once again assume that $X$ is a random variable with a
PDF represented by $f_X(x)$. Suppose $\tilde{x}$ is a sample drawn from $f_X(x)$. Note that prior to
sampling, $\tilde{x}$ was itself a random variable, equal to $X$ in distribution; denote
this random variable by $\tilde{X}$. It can readily be perceived that the cumulative probability
value of $\tilde{X}$ is likewise a random variable. Designate this latter random variable by $Y$,
which can mathematically be expressed as $Y = F_X(\tilde{X})$, where $F_X$ denotes the continuous
cumulative distribution function (CDF) of $X$. The PDF of $Y$ can then be found using
Lemma 1 (see Equation (8.10)), with $g = F_X$ and hence $F_X' = f_X$:

$$f_Y(y) = \frac{f_X\big(F_X^{-1}(y)\big)}{F_X'\big(F_X^{-1}(y)\big)} = \frac{f_X(\tilde{x})}{F_X'(\tilde{x})} = \frac{f_X(\tilde{x})}{f_X(\tilde{x})} = 1 \tag{8.11}$$

According to Equation (8.11), $f_Y(y)$ is equal to unity, which implies that $Y$ is
uniform over a range of width 1. Considering that $Y$ represents a cumulative
probability value, it must be non-negative and less than, or equal to, unity. Thus, the
uniform density, $f_Y(y)$, spans the interval $[0,1]$.
In the context of this thesis, CPV, defined by Equation (8.9), acts as the
counterpart of $Y$. Likewise, $v_{90}^{\text{realized}}$ can be regarded as the analogue of $\tilde{x}$; thus, prior
to its realization, it is considered a random variable, which can be denoted by $\tilde{V}_{90}$. In
light of this analogy, it is clear that, akin to the cumulative probability of $\tilde{X}$, which
is a random variable with the uniform PDF on $[0,1]$, the cumulative probability value
of $\tilde{V}_{90}$ (i.e., CPV) is uniformly distributed over the unit interval $[0,1]$.
▄
Proposition 1 proved that CPV is a uniform random variable on $[0,1]$. Thus, each $\mathrm{CPV}_j$,
computed based on Equation (8.9), is indeed a sample from the uniform PDF on $[0,1]$. The
problem has now boiled down to a familiar one: there exists a theoretical PDF, which is
the uniform on $[0,1]$, and a sample set of length 276, which encompasses the observed
CPVs. If the proposed methodology is estimating the PDF90IPOI’s accurately, the empirical
PDF constructed on the basis of the observed CPVs must resemble the uniform PDF on
[0,1]. The next subsection focuses on the scrutiny of this aspect.
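Before turning to the formal tests, the restructured problem can be illustrated with a small numerical sketch (in Python rather than the thesis's MATLAB, and with hypothetical densities standing in for the estimated PDF90IPOI's): when each realized value is scored by the CDF of its own correct density, the resulting CPVs are uniform on [0,1], even though every observation comes from a different density, exactly as with the 276 IPOIs; misspecified densities break the uniformity.

```python
# Illustrative sketch: the probability-integral transform in action, with
# hypothetical normal densities standing in for the estimated PDF90's.
import numpy as np
from scipy import stats

n = 276  # matching the number of IPOIs in the backtest
rng = np.random.default_rng(7)

# Give every "IPOI" its own predictive density (its own mu and sigma).
mus = rng.uniform(-0.5, 0.5, size=n)
sigmas = rng.uniform(0.2, 1.0, size=n)

# Idealized case: each realized value sits at an evenly spaced quantile of
# its own density, i.e., the densities are exactly right.
u = (np.arange(n) + 0.5) / n
realized = stats.norm.ppf(u, loc=mus, scale=sigmas)

# CPV_j = P(V <= v_j) under the j-th density (the analogue of Eq. (8.9)).
cpvs = stats.norm.cdf(realized, loc=mus, scale=sigmas)
print(stats.kstest(cpvs, "uniform").pvalue > 0.05)   # True: uniformity holds

# Misspecified densities (sigma doubled) destroy the uniformity.
bad = stats.norm.cdf(realized, loc=mus, scale=2.0 * sigmas)
print(stats.kstest(bad, "uniform").pvalue < 0.05)    # True: uniformity rejected
```

The same logic underlies the tests of the next subsection: uniformity of the pooled CPVs is evidence that every individual density was estimated accurately.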
8.2.4.1.1. Non-Parametric Statistical Tests to Examine the Uniformity Assumption
Figure 8-11 presents the histogram of the 276 CPVs computed using Equation (8.9). The
range of each bin can be regarded as one 'event'. According to Proposition 1, in theory, these
events must exhibit equal frequency; that is, the sample data must be distributed evenly
among the bins. A visual inspection indicates a sufficient resemblance between the shape of
this histogram and the theoretical uniform distribution over the range [0,1]. Statistical tests,
however, provide a firmer basis for substantive conclusions about the uniformity of the
CPVs, and consequently, the accuracy of the proposed methodology for estimating PDF90IPOI.
The following hypothesis test is, therefore, conducted:
H0: The observed set of 276 CPVs has been drawn from a reference population with
uniform density between 0 and 1.
H1: The observed set does not come from the reference density indicated under H0.
Two non-parametric tests are adopted to execute the hypothesis testing:
(i) Pearson’s chi-square test to assess ‘goodness of fit’
(ii) One-sample Kolmogorov-Smirnov test (K-S test)
Pearson’s chi-square test to assess ‘goodness of fit’:
In the case of the former test, Pearson’s chi-square test, the null hypothesis can more
precisely be stated as follows: the observed sequence of 276 CPVs has been drawn from a
population where all mutually exclusive events occur with equal frequency. As a case in point,
consider the histogram depicted by Figure 8-11. The range of each bin on the x-axis marks an
event which is mutually exclusive with the others, and the frequency of each event is defined
as the portion of the total observations it holds. The histogram in Figure 8-11 consists of 20
events. As an example, 0.05 < CPV ≤ 0.1 outlines one event whose observed and theoretical
frequencies can respectively be calculated as 14/276 = 0.0507 and 1/20 = 0.05.
Like the majority of statistical tests, Pearson's chi-square test defines a test statistic
which acts as a measure of similarity between the observed sample set and the reference
(theoretical) set. The test statistic resembles a 'sum of squared errors': one first computes the
individual difference between the observed frequency of each event and its expected
frequency under the theoretical uniform histogram; for the example indicated above, the
difference would be (0.0507 − 0.05). Each squared difference is then normalized by the
corresponding theoretical frequency, and the normalized terms are summed to obtain the test statistic.
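The computation just described can be sketched as follows (a hypothetical Python illustration rather than the thesis's MATLAB implementation; the CPVs here are simulated stand-ins for the 276 observed ones):

```python
# Pearson's chi-square statistic over 20 equal-width bins of simulated CPVs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
cpvs = rng.uniform(0.0, 1.0, size=276)  # stand-in for the observed CPVs

k = 20  # number of bins (events), as in Figure 8-11
observed, _ = np.histogram(cpvs, bins=k, range=(0.0, 1.0))
expected = np.full(k, len(cpvs) / k)  # 276/20 = 13.8 counts per bin under H0

# chi2 = sum over events of (observed - expected)^2 / expected,
# with df = k - 1 because the frequencies are constrained to sum to unity.
chi2 = ((observed - expected) ** 2 / expected).sum()
pvalue = stats.chi2.sf(chi2, df=k - 1)

# The library routine reproduces the hand computation.
stat2, p2 = stats.chisquare(observed)  # equal expected counts by default
print(np.isclose(chi2, stat2), np.isclose(pvalue, p2))  # True True
```

Working with counts rather than frequencies leaves the statistic unchanged, since each squared frequency difference scaled by the sample size yields the count form.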
It has been proven that under the null hypothesis, the test statistic follows a chi-square
distribution whose degrees of freedom equal the number of events (bins) reduced by one,
since there exists one constraint: the sum of the frequencies equals unity. Once the test
statistic is computed, one must refer to the chi-square distribution with the respective degrees
of freedom and fetch the probability of observing a value equal to, or greater than, the test
statistic. This probability is referred to as the p-value. A p-value greater than a pre-specified
significance level, Ƿ, indicates that there is no significant evidence against the null
hypothesis; hence, H0 cannot be rejected at the significance level Ƿ. For practical purposes,
Ƿ is often set to 0.05. Further details regarding Pearson's chi-square test can be accessed in
the vast literature addressing this topic; the author has benefited from Panneerselvam (2004)
[107] and Altiok and Melamed (2007) [6].

Figure 8-11. The figure presents the histogram of the CPVs computed for the 276 IPOIs (i.e., the 52nd IPO to
the 327th IPO), using Equation (8.9).
One-sample Kolmogorov-Smirnov test (K-S test):
Evidently, Pearson’s chi-square test is sensitive to the classification of the events, or
simply, the number of bins. Consequently, it would generate the most reliable outcomes when
data are abundant. The K-S test offers an alternative approach which resolves this issue. It
does not rely on a histogram (or PDF); rather, it compares the empirical CDF with the
reference CDF, and hence, is capable of coping with smaller samples.
In its ‘one-sample’ case, the K-S test is well-known for its capacity to examine whether a
sample comes from a normal distribution; yet, the theory is applicable to any reference
distribution, including uniform distributions. The null hypothesis of the one-sample K-S test
can more precisely be phrased as follows: the empirical CDF of the sample ‘agrees’ with the
CDF of the reference density. The null hypothesis is tested against the alternative hypothesis
according to which, the empirical CDF of the sample does not agree with the reference CDF.
The K-S test statistic is defined as the maximum absolute distance between the reference
CDF and the empirical CDF. The test statistic indeed acts as a measure to gauge the goodness
of fit, and as inferred from the definition, smaller values are more desirable. If ň denotes the
total number of observations and the test statistic is referred to as Šň, under the null hypothesis,
√ňŠň converges in distribution to the 'Kolmogorov' distribution. A small p-value casts
doubt on the validity of the null hypothesis; if the p-value falls below the pre-defined
significance level (e.g., Ƿ = 0.05), the null hypothesis is rejected. The reader is
encouraged to consult texts of statistics for further information on the K-S test (e.g.,
Panneerselvam, 2004 [107]; Altiok and Melamed, 2007 [6]).
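As a sketch of the definition above (again a hypothetical Python illustration, with a simulated sample standing in for the 276 observed CPVs), the K-S statistic can be computed by hand and checked against a library implementation:

```python
# K-S statistic: the maximum absolute distance between the empirical CDF and
# the reference CDF (here Uniform[0,1], whose CDF is F(u) = u on [0,1]).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sample = rng.uniform(0.0, 1.0, size=276)  # stand-in for the observed CPVs

x = np.sort(sample)
n = len(x)
ecdf_hi = np.arange(1, n + 1) / n  # empirical CDF just after each sorted point
ecdf_lo = np.arange(0, n) / n      # empirical CDF just before each sorted point
d = max(np.max(ecdf_hi - x), np.max(x - ecdf_lo))

res = stats.kstest(sample, "uniform")
print(np.isclose(d, res.statistic))  # True: hand-computed D matches the library
```

Because the statistic is built from the CDF rather than a binned histogram, no bin-count choice is involved, which is the advantage over the chi-square test noted above.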
Test Results:
The following p-values were obtained for the realized set of 276 CPVs:
Pearson’s chi-square test: p-value = 0.9710
K-S test: p-value = 0.8893
Both p-values are satisfactorily large and well beyond the significance levels set in practice.
The null hypothesis, therefore, cannot be rejected, from which it follows that the proposed
methodology (i.e., Phase I and Phase II) has produced accurate and reliable PDFs.
8.2.4.2. Test of VaR Break Frequency
Due to its comprehensiveness, the test of uniform CPVs alone would suffice to establish the
accuracy of the estimated PDFs, and hence, the subsequent inferences; however, putting that
aside, the methodology can further be scrutinized by examining the realized frequency of
‘VaR breaks’. Below, it is expounded why this test offers a suitable means to evaluate the
suggested methodology.
The ultimate goal of this research is to offer a methodology for quantifying the short-term
risk of IPOs. This objective is accomplished by devising a VaR approach which computes a
90-day VaR at the confidence level of 100 × (1 − ῤ)% (e.g., 95% if ῤ = 0.05). Recall from
Section 5.1 that a 90-day VaR represents a threshold price (or gain-loss) which can be
interpreted as follows: with the probability of (1 − ῤ) (e.g., 0.95), the price of day 90 will not
descend below the level designated by VaR. The process of estimating PDF90IPOI’s, in its
entirety, can be regarded as a black box. For each IPOI, this black box produces a PDF which
would then act as the basis of the respective VaR analysis. VaR is essentially a quantile and its
accuracy is contingent upon, and attributed to, the accuracy of the underlying PDF (in this
case, PDF90IPOI). Therefore, in examining a VaR system, it is indeed the underlying PDF which
is assessed.
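Since VaR is a quantile of the underlying PDF, its computation from an estimated density reduces to one line; the sketch below uses Monte Carlo draws from a hypothetical lognormal stand-in for PDF90 (not the thesis's estimates):

```python
# Sketch: given draws from an estimated day-90 price density, the 90-day VaR
# at confidence 1 - rho is simply the rho-quantile of that density.
import numpy as np

rng = np.random.default_rng(3)
price_draws = np.exp(rng.normal(np.log(20.0), 0.4, size=100_000))  # hypothetical PDF90

rho = 0.05
var_95 = np.quantile(price_draws, rho)  # 95% VaR threshold price

# Interpretation check: the price falls below the VaR threshold with
# probability rho.
print(np.mean(price_draws < var_95))  # ~0.05 by construction
```

Any inaccuracy in the estimated density therefore propagates directly into the quantile, which is why testing the PDFs and testing the VaRs amount to the same scrutiny.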
Examining the frequency of 'VaR breaks' is a prevalently used 'backtesting' approach to
evaluate various VaR models (e.g., Prem et al., 2010 [113]; Pinedo and Walter, 2013 [112]). If,
for a given case, the estimated VaR threshold exceeds the realized price (or gain-loss), a VaR
break has taken place. To warrant reliance, the 'realized' frequency of VaR breaks under a
given VaR approach is expected to be on par with the 'nominal' frequency of VaR breaks (i.e.,
ῤ).
The essence of the notion of ‘VaR break frequency’ rests on the frequency interpretation of
probability. Suppose that there exists an accurate VaR system which produces 95% VaRs on a
daily ‘time horizon’. If this system is used every day to obtain a VaR estimate for the next day,
the long-run frequency of VaR breaks would converge to 5%. In the context of this research,
the time horizon of the VaR system is 90 days and it has generated 276 VaR estimates. Table
8-1 reports the realized VaR break frequencies under the nominal frequencies of ῤ = 5%,
ῤ = 10%, ῤ = 15%, and ῤ = 20%. As presented, the computed VaR break frequencies
closely resemble their nominal counterparts which attests to the validity of the proposed
methodology (black box) of this thesis. Note that due to the relatively small size of the sample
(i.e., 276 VaR estimates), one would not expect the realized frequencies to be exactly on par
with the nominal frequencies, since the latter indeed denote long-run frequencies.
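The backtest itself can be sketched as follows (hypothetical densities, Python rather than the thesis's MATLAB). The realized prices are placed at evenly spaced quantiles of their own correct lognormal densities, the idealized outcome for an accurate system, so the realized break frequencies land as close to the nominal ῤ as a sample of 276 allows:

```python
# VaR break backtest: a break occurs when the estimated VaR threshold
# exceeds the realized day-90 price.
import numpy as np
from scipy import stats

n = 276
rng = np.random.default_rng(4)
mus = rng.normal(np.log(20.0), 0.2, size=n)   # per-IPOI log-price locations
sigmas = rng.uniform(0.2, 0.6, size=n)        # per-IPOI log-price scales

u = (np.arange(n) + 0.5) / n
realized = np.exp(mus + sigmas * stats.norm.ppf(u))  # idealized day-90 prices

for rho in (0.05, 0.10, 0.15, 0.20):
    thresholds = np.exp(mus + sigmas * stats.norm.ppf(rho))  # per-IPOI VaR
    breaks = int(np.sum(realized < thresholds))              # VaR breaks
    print(rho, breaks, round(breaks / n, 4))
# 0.05 -> 14 breaks (0.0507), 0.1 -> 28 (0.1014),
# 0.15 -> 41 (0.1486), 0.2 -> 55 (0.1993)
```

With only 276 estimates, even this idealized setup cannot hit the nominal frequencies exactly; realized frequencies within a few breaks of ῤ·276, as in Table 8-1, are the expected pattern.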
8.2.4.3. Test of Impact of Comparable Selection and Bayesian Updating
on Estimated PDFs
The two tests presented in the previous sections demonstrated the accuracy of the proposed
methodology of this thesis (i.e., Phase I and Phase II). There still remains one more avenue
worth exploring; that is, the impact of incorporating the two steps of comparable selection and
Bayesian updating on the quality of the estimated PDF90IPOI’s. These two steps have
undoubtedly enhanced the mathematical rigor of achieving the PDF90IPOI for a given IPOI. The
question is what their contribution has been to the accuracy of the final outcomes. This section
aims at addressing this aspect and inspects the effectiveness of these two steps in delivering
the mission of this research.
For the sake of fluency, it is advantageous to first review the logic underlying the
comparable selection and Bayesian updating steps. This thesis pursued a methodology to
estimate the probability density function of a short-term price or return for a given IPOI. This
task was, however, encumbered with lack of sufficient historical evidence and knowledge – an
inherent characteristic of most IPOs. To cope with this challenge, the closest comparable of
each IPOI was appointed to compensate for the requisite, yet non-existent, data. The impact of
the closest comparable was, however, kept to a minimum by adopting only its joint
density of parameters. Recall that this joint density was indeed in its posterior form; the
Bayesian approach assisted with this step and served to refine and revise the prior joint density
of the closest comparable before it was dispensed to the given IPOI.
Table 8-1. For each nominal frequency (ῤ) in the leftmost column, the rightmost column presents the
realized frequency, computed as the quotient of the middle column over the total number of IPOIs
(i.e., 276).

Nominal Frequency    VaR Breaks             Realized Frequency
of VaR Breaks (ῤ)    (Number out of 276)    of VaR Breaks
0.05                 16                     0.0580
0.1                  30                     0.1087
0.15                 41                     0.1486
0.2                  55                     0.1993
In order to demonstrate the significance of the steps of comparable selection and Bayesian
updating, it is next investigated how the methodology would perform if they were excluded. It
would shrink into the following steps:
First.

For a given IPO (i.e., IPO_k, where k = 52, …, 327), there exist (k − 1) preceding realized
IPOs; i.e., when analyzing IPO_k, none of the succeeding IPOs has yet taken place. This set
of length (k − 1) is utilized to obtain the respective uniform joint density of the
parameters (i.e., f_{Μ,Σ,Ξ,Ω}(μ, σ, ξ, ω)), following the same steps used to estimate the uniform
prior joint density for the 'pioneers' in Section 8.2.2. The only minor modification is that
here, the uniform joint density is obtained using all of the (k − 1) former IPOs, as
opposed to using the pioneers only. The process is sophisticated, and the brevity of a
short review may cause some confusion; the reader is, therefore, encouraged to consult
Section 8.2.2 for the specifics.
Second.
The first step yields individual uniform joint densities for the IPOs on the list. The
respective uniform joint density of each IPO serves as the joint density required in
Equation (6.42) to compute PDF90IPOI. Note that for a given IPO, its uniform joint density
was obtained using all of its preceding IPOs, with no reference to the closest comparable.
Further note that the respective uniform joint density was directly plugged into Equation
(6.42) without undergoing any Bayesian updating process.
The new methodology implicitly assumes that there exists a theoretical global joint density of
the parameters and attempts to approximate it by an empirical global joint density. For each
IPOI, this empirical joint density is developed using the ‘sample events’ dispensed from all of
the preceding realized IPOs; in the case of each former IPO, the combination of the confidence
intervals of its parameters represents one ‘sample event’ drawn from the theoretical global
density. Such logic is frequently deployed in empirical studies. Ruud (1993) [125] and Asquith
et al. (1998) [12] can be listed as examples in the context of IPOs. In contrast to the new
methodology, the methodology proposed in Phase I and Phase II associates specific joint
densities with individual IPOIs and assumes that the joint density of a given IPOI is unique
and can only be approximated by the posterior joint density of its closest comparable.
The outcomes of the new methodology are evaluated based on the same tests used to
examine the results of the former methodology:
(i) test of uniform cumulative probability values (Section 8.2.4.1)
(ii) test of VaR break frequency (Section 8.2.4.2)
For the sake of clarity, henceforth, the new methodology, which excludes the two steps of
comparable selection and Bayesian updating, is referred to as ‘simplified’, and the former
methodology, which preserves the two steps, is called ‘comprehensive’.
Test of Uniform Cumulative Probability Values
Figure 8-12 is the counterpart of Figure 8-11, plotted on the basis of the outcomes of the
‘simplified’ methodology. The histogram does not evoke the shape of a uniform distribution;
yet, the nonparametric tests of K-S and Pearson’s chi-square were employed to test the null
hypothesis that ‘the observed set of 276 CPVs has been drawn from a reference population
with uniform density between 0 and 1’. The tests would offer some perspective on the
probability of the observed sample set coming from the uniform distribution of [0,1] (see,
Section 8.2.4.1.1). The following p-values were yielded by the tests:
Pearson’s chi-square test: p-value = 1.86 × 10⁻⁶

K-S test: p-value = 0.00024

Both p-values fall far below any practical significance level and provide strong evidence
against the null hypothesis. It can, therefore, be inferred that the 'simplified' methodology
fails to estimate PDF90IPOI's accurately, which, in consequence, adversely impacts the
accuracy of the VaR outcomes, as shown next.
Test of VaR Break Frequency
Table 8-2 presents the results of the analysis of VaR break frequency. The realized frequencies
fall well below the nominal frequencies. This observation implies that the VaR inferences,
developed on the basis of the 'simplified' methodology, overestimate the risk at all four
confidence levels of 95%, 90%, 85%, and 80%.
The outcomes of the test of VaR break frequency can further be supported by a closer
inspection of Figure 8-12. It can be deduced from this figure that the PDF90IPOI's estimated
using the 'simplified' methodology associate comparatively greater probabilities with the
rare events on the left tail of the distributions (e.g., $0.1 < V_{90}/v_{\text{offer}} \le 0.1 + \mathrm{d}(v_{90}/v_{\text{offer}})$). Since VaR, at
any confidence level, is in essence a quantile, a 'heavier' left tail would yield relatively low
(i.e., pessimistic) VaR thresholds which are hard to break. The comparatively lower numbers
of VaR breaks in Table 8-2 attest to this feature.

Figure 8-12. This figure is produced following the same steps used for Figure 8-11. The only difference is the
underlying methodology which generates the CPVs: Figure 8-11 depicts the histogram of the CPVs computed
using the 'comprehensive' methodology, whereas here the graph is formed on the basis of the CPVs resulting
from the 'simplified' methodology.
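This tail effect can be made concrete with a small sketch (hypothetical densities, in Python): doubling the volatility of otherwise correct densities thickens the left tail, pushes the VaR thresholds down, and all but eliminates the breaks, the same signature exhibited in Table 8-2:

```python
# Correct vs. heavy-left-tail VaR thresholds at the 95% confidence level.
import numpy as np
from scipy import stats

n = 276
rng = np.random.default_rng(6)
mus = rng.normal(np.log(20.0), 0.2, size=n)
sigmas = rng.uniform(0.2, 0.6, size=n)
u = (np.arange(n) + 0.5) / n
realized = np.exp(mus + sigmas * stats.norm.ppf(u))  # idealized realizations

rho = 0.05
z = stats.norm.ppf(rho)                 # z < 0 at the 5% level
good = np.exp(mus + sigmas * z)         # thresholds from correct densities
fat = np.exp(mus + 2.0 * sigmas * z)    # left tail made too heavy

print(int(np.sum(realized < good)), int(np.sum(realized < fat)))  # 14 0
```

The heavy-tailed thresholds are so pessimistic that essentially no realization can fall below them, which is precisely how an overestimate of risk manifests in a break count.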
The accuracy of the outcomes of the 'comprehensive' methodology was previously
established (see Sections 8.2.4.1 and 8.2.4.2). The analysis conducted in this section
demonstrates that not only does the 'comprehensive' methodology yield reliable outcomes, but
all the steps it engages to obtain such outputs are also essential and effective. Excluding the
two steps of comparable selection and Bayesian updating would lead to a VaR model which
overestimates risk. A pessimism bias in risk estimations would deter the participation of
investors, even those with a moderate or high tolerance for risk, or may impose excessive
hedging requirements, leading to a waste of resources.

Table 8-2. This table can be regarded as the counterpart of Table 8-1. The latter reports the realized VaR
break frequencies under the 'comprehensive' methodology, whereas this table exhibits the outcomes of the
test of VaR break frequency for the 'simplified' methodology.

Nominal Frequency    VaR Breaks             Realized Frequency of VaR Breaks
of VaR Breaks (ῤ)    (Number out of 276)    under the 'Simplified' Methodology
0.05                 3                      0.0109
0.1                  9                      0.0326
0.15                 14                     0.0507
0.2                  23                     0.0833
Chapter 9. CONCLUSIONS AND FUTURE WORK
9.1. In Conclusion: IPO Risk Analysis Unravelled Using
the Bayesian Perspective
As the closing remarks of this thesis, it is worth reiterating that the focal intent of this
research was to design and backtest a methodology for quantifying the short-term risk of IPOs.
Devising a risk methodology for IPOs – one that would yield a mathematically rigorous
interpretation of risk – was often burdened by the scarcity of data. This thesis commenced
with the objective of solving this problem. It was founded on the foresight that one potential
solution lies in developing a methodology which would bridge the two familiar notions
indicated below.
First Notion. Comparable-based methods are prevalently utilized in various analysis
domains. It was established in Chapter 3 why founding the risk analysis on comparables
offers a promising approach to quantify the market price-based risk of IPOs. Therein,
evidence was provided for the predominance of the comparable-based approach among
analysts and its potentially significant role in market prices. It was, therefore, concluded
that a comparable-based approach can act as the building block of the risk assessment of
IPOs.
Second Notion. The Bayesian framework offers a unique setting for risk assessments in
various fields including finance. It is particularly advantageous when there exist limited
data, yet, strong prior belief – the two characteristics inherent in most IPOs. The capacity
of the Bayesian approach to measure the ‘epistemic’ component of uncertainty, in addition
to its ‘aleatory’ constituent, is another feature that notably contributes to its reputation as a
suitable framework to pursue risk analyses (see, Section 5.5).
In an environment of data paucity, the first notion would assist with identifying the most
suitable source of prior knowledge (i.e., the closest comparable). The second notion offers a
setting where the prior belief can be formulated and refined to be utilized in the risk
assessment of a given IPO. The following remarks further elaborate on the function of the
second notion in fulfilling the objectives sought by this thesis.
In this study, the stochastic process of price is modeled by a GBM with uncertain
parameters. In contrast to the frequentist school of thought, the Bayesian framework views
parameters as random variables whose uncertainty is of the 'epistemic' type. Such a setting
allows one to structure the joint density required to describe the probabilities of all
possible combinations of the model parameters. Subsequently, one can proceed with
estimating the predictive density of price (i.e., PDF90IPOI) by averaging over all combinations
of the parameters, with individual weights set as the probabilities of observing the vicinity of
the respective combinations (see Section 5.5). The predictive density yielded by this approach
has indeed hybridized parameter uncertainty and data uncertainty (i.e., both types of
uncertainty) which leads to inferences like VaR that can be viewed as comprehensive
concepts, simultaneously addressing all sources of uncertainty. Thus, not only does this study
offer a novel, systematic VaR methodology to quantify the risk of IPOs, but, by virtue of the latter
feature, it further enhances the accuracy of the proposed VaR approach; that is, in
estimating the uncertainty of price at a specified short-term horizon, it recognizes that the
definiteness of price is blurred by the epistemic uncertainty in addition to the aleatory
uncertainty, and hence proceeds with unravelling the IPO risk analysis using the Bayesian
VaR wisdom.
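The averaging described above can be sketched in a few lines (a hypothetical Python illustration, not the thesis's MATLAB implementation; the posterior over (μ, σ) and the GBM settings below are stand-ins):

```python
# Monte Carlo analogue of the Bayesian predictive density under a GBM price
# model with uncertain parameters: pooling draws across parameter
# combinations integrates the conditional GBM density against the posterior,
# blending epistemic (parameter) and aleatory (path) uncertainty.
import numpy as np

rng = np.random.default_rng(5)
v_offer, T = 20.0, 90 / 365  # offer price; 90-day horizon in years

m = 50_000
mu = rng.normal(0.10, 0.15, size=m)              # epistemic: drift draws
sigma = np.abs(rng.normal(0.50, 0.10, size=m))   # epistemic: volatility draws
z = rng.standard_normal(m)                       # aleatory: GBM shock

# GBM terminal price for each (mu, sigma, z) triple.
v90 = v_offer * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)

# Any inference now reflects both uncertainty types; e.g., the 90-day VaR at
# the 95% confidence level is the 5% quantile of the pooled draws.
var_95 = np.quantile(v90, 0.05)
print(0.0 < var_95 < v_offer)  # True for these stand-in settings
```

Fixing (μ, σ) at point estimates would collapse the epistemic component and understate the spread of the predictive density, which is the feature the Bayesian treatment is designed to preserve.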
The risk methodology of this thesis comprises two phases (i.e., Phase I and Phase II) and
addresses the aforementioned two notions as described below.
The first notion is handled in Phase I (see, Section 6.2) which yields two significant
contributions: The primary contribution is a DEA-based multi-dimensional ‘similarity’
metric, developed for selecting the closest comparable (i.e., the most suitable source of
prior knowledge). In working out the details of the similarity metric and in searching for a
DEA super-efficiency model which would satisfy the specific features and constraints of
the problem under consideration, an opportunity for a new model was recognized. This new
DEA model of super-efficiency forms the secondary contribution of Phase I. It is
worthwhile to note that Phase I has been developed within the framework of DEA;
however, despite its conventional usage as a tool of efficiency analysis, DEA here serves as
a mapping means. In the multi-dimensional space of operating and financial characteristics,
each firm can be represented by one point. DEA maps this point into a new space whose
coordinates are DEA parameters. The mapping facilitates multi-dimensional comparisons
between firms which then lead to the selection of closest comparables.
The second notion is addressed in Phase II (see, Section 6.3) which focuses on the chief
contribution of this thesis; that is, measuring the short-term risk of IPOs in a Bayesian
setting.
Subsequent to designing the methodology, the validation studies were conducted. The
suggested methodology was implemented and examined by means of approximately 11000
lines of program code in MATLAB (a trademark of MathWorks®). The process of data
mining returned 327 U.S. IPOs (1990-2012) in the ‘health-care’ sector which were engaged
towards this goal (see, Chapter 7). Below, the major pieces of the scrutiny work and the
respective outcomes are concisely reviewed (see, Chapter 8 for more details).
Piece 1: Phase I
Phase I is implemented in three different settings, referred to as RUN 1, RUN 2,
and RUN 3. The latter – where both steps of efficient IPO treatment (see, Section
6.2.4) and outlier detection (see, Section 6.2.5) are executed – offers a more
theoretically coherent structure and consequently leads to outcomes more consistent
with practical experience. It was, therefore, chosen for the rest of the analyses.
In Section 8.1, the robustness of the comparable selection method is demonstrated
through three different qualitative evaluations: (i) inspection of the number of
comparables associated with individual IPOIs, (ii) scrutiny of ‘average-union’
ratios, (iii) systematic examination of the changes in the composition of the set of
comparables for a given IPOI, induced by sequentially enhancing the ‘pool of
candidates’ of the given IPOI through adding a new IPO to its pool in each
sequence. It was established that the composition of the set of comparables for a
given IPOI tends to remain intact as the respective pool of candidates grows in size.
The composition may only change with the entry of a new IPO which, upon its
arrival, would replace an existent comparable. This observation attests to the power
of Phase I in finely selecting the comparables which play a key role in Phase II.
Piece 2: Phase II and Model Calibration
Phase II aims at estimating the PDF of each IPOI’s stock price at day 90
(i.e., PDF90IPOI). This PDF subsequently acts as the basis of 90-day VaR estimations.
One significant piece of information required to carry out Phase II is supplied by
Phase I; that is, the ID of the closest comparable to a given IPOI. This knowledge is
then utilized in a Bayesian setting to accomplish the goal pursued by Phase II.
Finding the closest comparable of each IPOI mandates defining a similarity metric
which would act as a measure for proximity and compute the distance between the
IPOI and each of its comparables. The distance function developed in Phase I serves
this aim, yielding a smaller distance for a closer comparable. To reduce uncertainty,
the parameter of the distance function is calibrated using a maximum likelihood
approach. Section 8.2.3 describes how the calibration is executed and presents the
outcomes. The procedure returns the value of 1.5 as the maximum likelihood
estimate of α. Recall that in the distance function, α represents the coefficient of the
second term, which gauges the difference between the efficiency scores. Hence, a
value of α greater than zero – and indeed greater than unity – affirms the significance of
modelling the compromise between ψ and λ and validates the notion of
incorporating the efficiency score as one dimension of comparison, postulated
originally in Sections 3.3 and 6.2.
Piece 3: Scrutiny of Estimated PDF90IPOI’s
This piece is undoubtedly the most significant and critical of all the pieces since it
examines the ‘standalone’ accuracy of the estimated PDF90IPOI’s. Any test selected to
accomplish this task would simultaneously authenticate the accuracy of VaR
estimates since VaR is essentially a quantile and its credibility stems from the
accuracy of its underlying PDF. Section 8.2.4.1 proposes a backtesting technique
which scrutinizes the correctness of the estimated PDF90IPOI’s through inspecting the
PDF of the cumulative probability values (CPVs) associated with the realized V90’s.
For a given IPOI, the CPV of the realized V90 is computed based on the respective
estimated PDF90IPOI. It is proven that the theoretical PDF of the CPVs is uniform over
the interval of [0,1] (see, Proposition 1, Section 8.2.4.1). From Proposition 1, it
follows that if the proposed methodology of this thesis is projecting reliable
PDF90IPOI’s, one must fail to reject the null hypothesis ‘the observed set of 276 CPVs
has been drawn from a reference population with uniform density between 0 and 1’
against the alternative ‘the observed set does not come from the reference density’.
The following two statistical tests were employed to perform the hypothesis testing:
(i) Pearson’s chi-square test to assess ‘goodness of fit’, (ii) One-sample
Kolmogorov-Smirnov test (K-S test). The tests yielded p-values of 0.9710 and
0.8893, respectively. Both p-values are sufficiently large that the null hypothesis
cannot be rejected, which ensures that the proposed methodology conforms
to the prescribed standards.
Computing the frequencies of ‘VaR break’ was another approach undertaken to
further scrutinize the credibility of the estimated PDF90IPOI’s. The frequency
interpretation of probability can be employed to illuminate the rationale behind this
technique. Suppose that the entire process of quantifying risk can be regarded as a
‘system’ which estimates a 90-day VaR for a given IPOI. For an accurate system –
if it is utilized in n cases of IPOIs to estimate the corresponding 90-day VaRs at the
confidence level of (1 − ϱ) – the frequency of VaR breaks must converge to ϱ as n
tends to infinity. The following VaR break frequencies were reported for the 90-day
VaRs at the confidence levels of 95%, 90%, 85%, and 80%: 0.0580, 0.1087, 0.1486,
and 0.1993. Due to the limited sample size, one does not expect the realized
frequencies to match the nominal frequencies (i.e., ϱ) exactly; yet, they are
sufficiently close to validate the VaR approach.
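A minimal sketch of the VaR-break count, again on simulated data rather than the thesis sample: for a well-calibrated system, the break frequency at the confidence level (1 − ϱ) should hover near ϱ.

```python
import numpy as np

def var_break_frequency(realized, var_threshold):
    """Fraction of IPOs whose realized day-90 price falls below the VaR threshold."""
    return float(np.mean(np.asarray(realized) < var_threshold))

rng = np.random.default_rng(1)
n = 276

# Hypothetical, well-calibrated system: realized prices come from the
# same log-normal PDF90 the system assumes.
mu, sigma = 3.0, 0.4
realized = np.exp(rng.normal(mu, sigma, n))
pdf_sample = np.exp(rng.normal(mu, sigma, 200_000))  # Monte Carlo draw of PDF90

freqs = {}
for conf in (0.95, 0.90, 0.85, 0.80):
    rho = round(1.0 - conf, 2)
    var_threshold = np.quantile(pdf_sample, rho)  # VaR = rho-quantile of price
    freqs[conf] = var_break_frequency(realized, var_threshold)
    print(conf, freqs[conf])  # should hover near rho
```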
In closing this recapitulating discussion, it is worth noting that any model is
essentially an abstraction of the reality it intends to reproduce. Therefore, regardless
of its complexity, a model can only be useful if it reflects that reality. By means of
the validation process, one can ensure that the model provides a sensible approximation of the
observed phenomena and conforms to the rules of the underlying system as well as the
objectives, principles and preferences of the original problem. These criteria have been
achieved by the risk methodology of this thesis. Here, the main objective was the
mathematical modelling of the short-term risk of IPOs and the validity and accuracy of the
methodology was established through backtesting analyses. One day, one might design a new
model to solve the same problem – a frequent scenario in the history of science. The point
indicated above then implies that this model would have to undergo the same backtesting and
scrutiny process as the proposed methodology of this research did. The model would only be
proven accurate and reliable if it meets the pre-set criteria and demonstrates satisfactory test
outcomes.
9.2. Future Research Directions and Application Prospects
This section is organized as follows: Section 9.2.1 presents further applications of
this research, and Section 9.2.2 briefly outlines potential future research directions.
9.2.1. Extension of Applications
The primary motivation and application of the proposed methodology in this thesis is to
measure the short-term risk of a portfolio consisting of a single IPO. The applications are,
however, broad. Particularly pronounced is the capacity of the methodology to facilitate and
improve the risk analysis of portfolios comprising other assets, in addition to IPO stocks.
Furthermore, the methodology can be employed towards developing hedging strategies for
portfolios containing IPO shares. As clarified in the following sections, these two applications
form the basis for developing an approach which decomposes the VaR computed for an IPO
into its systematic and unsystematic components and estimates the contribution of each source
to the risk of the IPO. The following three sections (i.e., Sections 9.2.1.1 to 9.2.1.3)
briefly outline the main points of these applications.
9.2.1.1. Risk Measurement of Portfolios Containing IPOs
For various reasons, skilled investors may be persuaded to hold a diversified portfolio of
stocks of IPO and public companies. One such incentive could be diversifying away the risk of
IPOs. Regardless of the underlying motivation for building such portfolios, the investors
would seek to properly measure the corresponding risk. This task is, nevertheless, hard to
accomplish due to the lack of sufficient market history, which impedes estimating the
correlation between the returns of IPO and non-IPO stocks. The methodology
offered in this thesis provides a path towards mitigating this problem, and the remainder of
this section attempts to map out a more explicit formulation of this path using a simple
example:
Consider a portfolio of two assets: stocks of company IP and stocks of company PU.
Company PU is a mature public company with sufficient trading data, and company IP is on
the verge of going public. For the sake of simplicity, assume that the weights of the assets are
pre-set. The approach presented below aims at estimating the short-term risk of this portfolio.
In accordance with the methodology proposed in this thesis, the short-term risk is defined as a
90-day VaR computed based on the probability density function of the value of the portfolio
90 days after the issuing day of company IP. This probability density function is, hereafter,
referred to as PDF90Portfolio.
One prerequisite piece of information for estimating PDF90Portfolio is the correlation between
the (logarithmic) returns of company IP and company PU over the 90-day period commencing
on the first day of trade for company IP. The correlation can be approximated using the closest
comparable of company IP as detailed next:
Suppose Phase I has identified company CO as the closest comparable to company IP.
Recall that company CO is selected as the closest comparable in terms of its pre-IPO
characteristics, and at present, it is a public company with an acceptable length of market
history. The short-term correlation between the returns of company IP and company PU can be
approximated as the correlation between the returns of company CO, realized over the
respective 90-day post-IPO period and the returns of company PU, observed during the same
90-day window. Once the correlation between the returns of company IP and company PU is
estimated, one can proceed with developing an approach for estimating PDF90Portfolio. Drawing
upon the methodology proposed in this thesis, one such approach is presented below.
The value of the portfolio at time t, denoted by V_t^{P}, can be expressed mathematically as
follows:

V_t^{P} = w^{PU} V_t^{PU} + w^{IP} V_t^{IP}    (9.1)
where the superscripts PU and IP designate company PU and company IP, respectively. The
variable w stands for the weight of an asset in the portfolio, and V_t represents value at time t.
Based on Equation (9.1), an infinitesimally small change in the value of the portfolio,
dV_t^{P}, over an infinitesimal time interval, dt, can be expressed as shown below:

dV_t^{P} = w^{PU} dV_t^{PU} + w^{IP} dV_t^{IP}    (9.2)
Recall that in Section 6.3, the stochastic process of price was modelled as a GBM.
Drawing upon the same assumption, dV_t^{PU} and dV_t^{IP} can be described as follows:

dV_t^{PU} = μ^{PU} V_t^{PU} dt + σ^{PU} V_t^{PU} dZ_{1,t}    (9.3)

dV_t^{IP} = μ^{IP} V_t^{IP} dt + σ^{IP} V_t^{IP} dZ_{2,t}    (9.4)
where the parameters μ and σ are constants representing the expected rate of return and
volatility, respectively, and the variable Z follows a Wiener process. Substituting Equations
(9.3) and (9.4) into Equation (9.2) and rearranging yield the following equation:
dV_t^{P} = w^{PU} V_t^{PU} (μ^{PU} dt + σ^{PU} dZ_{1,t}) + w^{IP} V_t^{IP} (μ^{IP} dt + σ^{IP} dZ_{2,t})    (9.5)
Equation (9.5) can be written in terms of two uncorrelated Wiener processes. To this end,
dZ_{2,t} is replaced by the following two terms:

dZ_{2,t} = ρ dZ_{1,t} + √(1 − ρ²) dZ_{3,t}    (9.6)
where Z_{3,t} denotes a new Wiener process that is uncorrelated with Z_{1,t}, and ρ denotes the
correlation coefficient between the logarithmic returns of company PU and company IP over
the time interval discussed above. By substituting Equation (9.6) into Equation (9.5), the
stochastic differential equation of the value of the portfolio, V_t^{P}, takes the final form shown
below:
dV_t^{P} = (w^{PU} V_t^{PU} μ^{PU} + w^{IP} V_t^{IP} μ^{IP}) dt
         + (w^{PU} V_t^{PU} σ^{PU} + w^{IP} V_t^{IP} σ^{IP} ρ) dZ_{1,t}
         + w^{IP} V_t^{IP} σ^{IP} √(1 − ρ²) dZ_{3,t}    (9.7)
Subsequently, one can solve for the conditional PDF of the portfolio value on day 90,
given a specific combination of the parameters. The unconditional PDF of the portfolio value
on day 90 (i.e., PDF90Portfolio) can then be estimated in the same fashion as PDF90IPOI, using
Equation (6.42), if the following two considerations are taken into account:
Consideration 1. Company PU introduces two additional parameters into Equation (6.42).
Company PU represents a mature public company, and its sufficient trading history
provides some confidence regarding estimates of the respective parameters (i.e., μ^{PU} and
σ^{PU}). It is, therefore, assumed that these parameters are not random and can be treated as
fixed values. This assumption reduces the number of parameters required to estimate
PDF90Portfolio and simplifies the solution to a great extent. In the case of company IP (i.e.,
the IPO), the same approach as the one proposed in Chapter 6 is employed; that is, the
parameters μ^{IP} and σ^{IP} are assumed to be draws from the random variables Μ^{IP} and Σ^{IP},
respectively. In addition, to better reflect the specific characteristics of the first day of
trade, the random variable of the ratio of first closing price to offer price is assumed to
follow a log-normal distribution whose underlying parameters (i.e., hyperparameters) are
random themselves. The hyperparameters are denoted by ξ^{IP} and ω^{IP}, which represent
realizations from the random variables Ξ^{IP} and Ω^{IP}, correspondingly. Since Ξ^{IP} and Ω^{IP}
could potentially be correlated with Μ^{IP} and Σ^{IP}, their distributions must be considered
jointly in order to find PDF90Portfolio. This joint distribution is designated by the function
f_{Μ^{IP},Σ^{IP},Ξ^{IP},Ω^{IP}}(μ^{IP}, σ^{IP}, ξ^{IP}, ω^{IP}).
Consideration 2. A closed-form expression for the conditional PDF of the portfolio value on
day 90 is absent.
Equation (6.42) was developed for a portfolio of a single asset (i.e., a given IPOI) whose
stochastic process of price movements was assumed to follow a GBM. Therefore, for a
given combination of the parameters, the conditional PDF of price on day 90 could be
expressed in closed form; it was a log-normal distribution, denoted by
ℒ_{V_{90}}(v_{90} | Μ ≈ μ, Σ ≈ σ, V_1 ≈ v_1), which can more precisely be written as
ℒ_{V_{90}^{IP}}(v_{90}^{IP} | Μ^{IP} ≈ μ^{IP}, Σ^{IP} ≈ σ^{IP}, V_1^{IP} ≈ v_1^{IP}). On the contrary, in the case of a portfolio
comprising two GBM assets, it seems hard to find a closed-form expression for the
conditional PDF of the portfolio value on day 90. It is left to be explored whether the stochastic
differential equation of the portfolio value, presented in Equation (9.7), can be solved
analytically. At this stage, due to the absence of such a solution, for each combination of
the parameters, a Monte Carlo simulation can be performed to estimate the conditional
PDF of the portfolio value on day 90.
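Such a Monte Carlo step might look as follows. All parameter values are illustrative assumptions, and the decomposition dZ_{2,t} = ρ dZ_{1,t} + √(1 − ρ²) dZ_{3,t} of Equation (9.6) is used to generate the correlated increments:

```python
import numpy as np

def simulate_portfolio_v90(w_pu, w_ip, v1_pu, v1_ip,
                           mu_pu, sig_pu, mu_ip, sig_ip, rho,
                           n_paths=20_000, n_steps=89, seed=0):
    """Monte Carlo draw of the portfolio value on day 90, conditional on
    one fixed combination of the parameters."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 252.0  # daily step in years (an assumed day-count convention)
    z1 = rng.standard_normal((n_paths, n_steps))
    z3 = rng.standard_normal((n_paths, n_steps))
    # Equation (9.6): build the correlated increment from two uncorrelated ones
    z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * z3
    # Exact GBM step: V_{t+dt} = V_t * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z)
    log_pu = (mu_pu - 0.5 * sig_pu**2) * dt + sig_pu * np.sqrt(dt) * z1
    log_ip = (mu_ip - 0.5 * sig_ip**2) * dt + sig_ip * np.sqrt(dt) * z2
    v90_pu = v1_pu * np.exp(log_pu.sum(axis=1))
    v90_ip = v1_ip * np.exp(log_ip.sum(axis=1))
    return w_pu * v90_pu + w_ip * v90_ip

# Illustrative parameters only (not estimates from the thesis data)
v90 = simulate_portfolio_v90(w_pu=0.6, w_ip=0.4, v1_pu=50.0, v1_ip=20.0,
                             mu_pu=0.08, sig_pu=0.25, mu_ip=0.15,
                             sig_ip=0.60, rho=0.35)
var_95 = float(np.quantile(v90, 0.05))  # day-90 value threshold at 95% confidence
print(v90.mean(), var_95)
```

The empirical distribution of the simulated values serves as the conditional PDF; repeating this step for each parameter combination and weighting by the joint parameter density yields the unconditional PDF.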
With these two considerations, Equation (6.42) is modified into the following form:

f_{V_{90}^{P}}(v_{90}^{P}) = ∫_{ω^{IP}=−∞}^{+∞} ∫_{ξ^{IP}=−∞}^{+∞} ∫_{σ^{IP}=−∞}^{+∞} ∫_{μ^{IP}=−∞}^{+∞} [ ∫_{v_1^{IP}=−∞}^{+∞} 𝒢_{V_{90}^{P}}(v_{90}^{P} | Μ^{IP} ≈ μ^{IP}, Σ^{IP} ≈ σ^{IP}, V_1^{IP} ≈ v_1^{IP}, μ^{PU}, σ^{PU}) × f_{V_1^{IP}}(v_1^{IP} | ξ^{IP}, ω^{IP}) dv_1^{IP} ] × f_{Μ^{IP},Σ^{IP},Ξ^{IP},Ω^{IP}}(μ^{IP}, σ^{IP}, ξ^{IP}, ω^{IP}) dμ^{IP} dσ^{IP} dξ^{IP} dω^{IP}    (9.8)
where V_{90}^{P} denotes the value of the portfolio on day 90. The function f_{V_{90}^{P}}(v_{90}^{P}) represents
PDF90Portfolio, which indeed stands for the unconditional PDF of V_{90}^{P}. The conditional PDF of
V_{90}^{P} is presented as the function 𝒢_{V_{90}^{P}}(v_{90}^{P} | Μ^{IP} ≈ μ^{IP}, Σ^{IP} ≈ σ^{IP}, V_1^{IP} ≈ v_1^{IP}, μ^{PU}, σ^{PU}). The rest
of the notation remains consistent with that presented earlier.
9.2.1.2. Hedging the IPO Investment Risk
The unconditional PDF (i.e., PDF90Portfolio) estimated above can be utilized to develop
hedging strategies. In the previous section, it was assumed that the proportion of each asset in
the portfolio is predetermined, and the analysis aimed at quantifying the short-term risk of the
portfolio under the pre-set weights. One can, however, formulate an optimization problem to
find the optimum set of weights which minimizes the 90-day VaR of the portfolio, at a given
confidence level. To elaborate, for each combination of the weights, the steps indicated above
would be repeated to estimate PDF90Portfolio and the respective VaR. The set of weights under
which the portfolio exhibits the minimum VaR, or equivalently the minimum risk, is selected as
the optimum set. One can partially hedge against the risk of the IPO by creating a portfolio on
the basis of the optimum weights. The optimization problem can be augmented with an
additional constraint to reflect the expected portfolio return desired by the investor. This
approach would offer a means to achieve the right balance between risk and reward.
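A simple grid search over the IPO weight illustrates the idea; the joint return sample and all parameters below are illustrative assumptions, not estimates from the thesis data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths = 50_000
T = 89 / 252  # an assumed day-count convention for the 90-day horizon

# Joint Monte Carlo sample of gross 90-day returns for PU and IP from
# two correlated GBMs (hypothetical parameters).
rho, mu_pu, sig_pu, mu_ip, sig_ip = 0.35, 0.08, 0.25, 0.15, 0.60
z1 = rng.standard_normal(n_paths)
z3 = rng.standard_normal(n_paths)
z2 = rho * z1 + np.sqrt(1 - rho**2) * z3
r_pu = np.exp((mu_pu - 0.5 * sig_pu**2) * T + sig_pu * np.sqrt(T) * z1)
r_ip = np.exp((mu_ip - 0.5 * sig_ip**2) * T + sig_ip * np.sqrt(T) * z2)

def var_95(w_ip):
    """95% 90-day VaR (loss per unit of initial value) at IPO weight w_ip."""
    v90 = (1 - w_ip) * r_pu + w_ip * r_ip  # initial portfolio value normalized to 1
    return 1.0 - float(np.quantile(v90, 0.05))

# Grid search over the IPO weight; the minimizing weight partially
# hedges the IPO position.
grid = np.linspace(0.0, 1.0, 101)
vars_ = np.array([var_95(w) for w in grid])
w_opt = float(grid[np.argmin(vars_)])
print(w_opt, vars_.min())
```

An expected-return constraint could be added by restricting the grid to weights whose mean portfolio return exceeds the investor's target.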
9.2.1.3. IPO VaR Decomposition
One can benefit from the techniques discussed in the previous two sections (i.e., Sections
9.2.1.1 and 9.2.1.2) to decompose the computed VaR for a given IPOI into partial VaRs. A
case in point would be estimating the relative contributions of the systematic and unsystematic
components to the total VaR of IPOI. One possible approach to accomplish this goal is
described below:
Suppose that a 90-day VaR is computed for a given IPOI which was referred to as company
IP in the previous sections. Further suppose that a portfolio of assets IP and PU is created
using the optimum weights which minimize the portfolio VaR. Recall that the abbreviation PU
stands for ‘public’ and represents a mature public company. The difference between the VaR
of IPOI and the VaR of the portfolio can be considered as a proxy for the unsystematic
component of the risk of IPOI since it is the risk that vanishes upon diversification. It must be
noted that the suggested method serves as a tentative solution. In practice, often a relatively
large portfolio is required to completely eliminate the unsystematic risk.
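The proposed decomposition reduces to a subtraction; the numbers below are purely illustrative.

```python
def decompose_var(var_ipoi, var_min_portfolio):
    """Tentative split of the IPO's total VaR: the part that vanishes
    upon diversification proxies the unsystematic component."""
    return {
        "unsystematic": var_ipoi - var_min_portfolio,
        "systematic": var_min_portfolio,
    }

# Illustrative (made-up) figures: a standalone 90-day IPO VaR of 0.45
# versus 0.20 for the VaR-minimizing two-asset portfolio.
parts = decompose_var(0.45, 0.20)
print(parts)
```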
9.2.2. Future Research: Incorporating Information from Other
Comparables
Phase I (see, Section 6.2) identifies a set of comparables for a given IPOI. The closest
comparable is selected as the one which exhibits the minimum distance from IPOI, where
‘distance’ is calculated using the similarity metric presented in Equation (6.37). Recall that the
proposed methodology only appoints the closest comparable’s posterior joint PDF of
parameters to act as the (prior) joint PDF of parameters required to estimate PDF90IPOI for the
respective IPOI. Stated otherwise, the information held by the rest of the comparables is
dismissed. Future research may focus on modifying Phase I such that it utilizes the
supplementary information offered by the remaining comparables. The scope of Phase I may
be extended in one of the following two directions:
Direction 1. Virtual Firm
The outcomes of Phase I can be used to form a virtual firm which would be positioned
closer to the given IPOI than any of the respective comparables. One can then proceed with
the virtual firm, as opposed to the closest comparable, to conduct Phase II. The following
details ought to be worked out before the method can be adopted in the analysis of the
short-term risk of IPOs:
The concept of virtual firm can be illustrated using a small-scale example. Figure 9-1.a
visualizes a simple one-input and one-output DEA example with a variable returns to scale
(VRS) frontier of efficiency. Suppose that firm G represents IPOI. Its set of comparables
would then consist of firms A, B, E, and F, encircled in grey (see, Section 6.2 for more
details). The virtual firm would be located in the convex hull of the set of comparables and
is defined as the convex combination of A, B, E, and F which falls the closest to firm G.
Firm K in Figure 9-1.b designates the virtual firm in this case. To attain a suitable
compromise, an additional optimization problem must be solved which would search for
the optimum vector of weights to be engaged towards building the virtual firm.
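One way to carry out this auxiliary optimization is projected gradient descent over the simplex of combination weights. The sketch below uses hypothetical (input, output) coordinates for firms A, B, E, F, and G; nothing here is taken from the thesis data.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    r = idx[u - css / idx > 0][-1]
    return np.maximum(v - css[r - 1] / r, 0.0)

def virtual_firm(comparables, target, n_iter=20_000):
    """Projected gradient descent for the convex combination of the
    comparables that lies closest (in Euclidean distance) to the target."""
    X = np.asarray(comparables, dtype=float)  # one row per comparable
    g = np.asarray(target, dtype=float)
    lr = 0.5 / np.linalg.norm(X @ X.T, 2)     # step size from the Lipschitz bound
    w = np.full(len(X), 1.0 / len(X))
    for _ in range(n_iter):
        grad = 2.0 * X @ (X.T @ w - g)        # gradient of ||X^T w - g||^2
        w = project_to_simplex(w - lr * grad)
    return w, X.T @ w

# Hypothetical coordinates for comparables A, B, E, F and firm G
comps = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.5), (6.0, 5.0)]
g = (3.0, 2.0)
w, k = virtual_firm(comps, g)
print(w, k)  # k plays the role of the virtual firm K
```

Since G lies outside the convex hull in this toy setup, the optimum lands on the hull's boundary, mirroring firm K in Figure 9-1.b.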
The virtual firm, however, adds to the intricacy of the analysis. If the GBM assumption is
maintained, the price process of the virtual firm can be modelled as the convex combination
of the GBM processes of the comparables. Equation (9.5) merges two GBM processes but
can be extended to address more. As discussed there, the result would not be a GBM
anymore. It can readily be deduced that the process of estimating the requisite, yet non-
existent, joint PDF of the parameters for a given IPOI becomes greatly complicated due to
the following reasons: (i) the price models for the virtual firm and for the IPOI are
Figure 9-1. The figure illustrates a one-input and one-output DEA example with a variable returns to
scale frontier of efficiency. Firm G represents the firm of interest whose comparables are encircled
in grey in the upper graph. The lower graph depicts the convex hull of the set of comparables as well
as the ‘virtual’ firm which resides on the boundary of the convex hull and is designated by Firm K.
incompatible, and so are the corresponding underlying parameters; (ii) a closed-form
solution for the stochastic differential equation of the virtual firm seems hard to achieve,
which would increase the complexity of the Bayesian updating process, described in
Section 6.3.2.1. Both reasons call for extra refining steps in order to prepare the prior joint
PDF of the parameters of the virtual firm to be dispensed to the given IPOI.
Besides, the notion of a virtual firm would complicate the problem in another way: it
introduces extra parameters pertaining to the other comparables. The increase in the number
of parameters would significantly magnify the computational burden.
As a side note, the total number of outstanding shares varies across firms and impacts
per-share prices. Hence, one may use ‘market
capitalization’, as opposed to ‘price’, when combining the GBM processes. The achieved
model would then describe the evolution of the market capitalization of the virtual firm.
Direction 2. Compound Posterior Joint Density
Direction 2 can be considered as an alternative to Direction 1, presented above. Likewise,
it offers an approach which would utilize the information available from all the
comparables to enhance the proposed methodology in Chapter 6. Therein, only the closest
comparable’s posterior joint PDF of parameters was adopted as a proxy for the (prior) joint
PDF required to estimate PDF90IPOI. One can, however, incorporate the information held by
the remaining comparables into the analysis through building a compound posterior joint
PDF of parameters which can subsequently replace the posterior joint PDF of the closest
comparable in the methodology presented in Chapter 6. In order to form a compound
posterior joint PDF, first, the posterior joint PDFs of the remaining comparables must be
computed in the same fashion as that for the closest comparable. The compound posterior
joint PDF is then formed as a convex combination of the posterior joint PDFs of all the
comparables.
The challenge of the approach suggested here lies in determining the coefficients of
combination. The posterior joint PDFs ought to be combined on the basis of the probability
of observing each which must bear some relevance to the distance between the
corresponding comparable and IPOI. To this end, one can define a function which would
relate the distance between a given IPOI and each comparable (calculated using Equation
(6.37)) to the probability of observing the posterior joint PDF of the respective comparable.
Inspired by the Boltzmann distribution, one possible solution is demonstrated below:
P_{cb} = e^{−β D_{cb}²} / Σ_{c=1}^{η} e^{−β D_{cb}²}    (9.9)
where P_{cb} represents the probability of the event where the prior joint PDF of IPOI is equal
to the posterior joint PDF of the respective comparable. In line with Equation (6.37), IPOI
is identified by the subscript b and the comparable is marked by the subscript c, where
c = 1, …, η and the variable η indicates the total number of comparables associated with
IPOI. The distance between IPOI (i.e., b) and the corresponding comparable (i.e., c) is
denoted by D_{cb}. The parameter β is an unknown constant that can be estimated in a
calibration process. Note that according to Equation (9.9), as the distance between a given
IPOI and its comparable approaches infinity, the probability of the respective two joint
PDFs resembling each other tends to zero. Furthermore, the closer a comparable is
positioned to IPOI, the higher is the probability that its posterior joint PDF is identical to
the prior joint PDF of IPOI. Both inferences indicate that Equation (9.9) conforms to the
concept of ‘distance’ defined in this thesis. It, however, serves as a provisional solution,
pending further scrutiny and evaluation.
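Equation (9.9) can be sketched directly; the distances and the value of β below are hypothetical.

```python
import numpy as np

def compound_weights(distances, beta=1.0):
    """Boltzmann-style weights of Equation (9.9): P_cb proportional to
    exp(-beta * D_cb^2). Subtracting the maximum exponent before
    exponentiating keeps the computation numerically stable."""
    d2 = np.asarray(distances, dtype=float) ** 2
    logits = -beta * d2
    logits -= logits.max()
    w = np.exp(logits)
    return w / w.sum()

# Hypothetical distances of four comparables from a given IPOI
D = [0.5, 1.0, 2.0, 4.0]
P = compound_weights(D, beta=1.0)
print(P)  # the closest comparable receives the largest weight
```

As β grows, the weights concentrate on the closest comparable, recovering the current single-comparable scheme as a limiting case.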
It is worth pointing out that the discussions presented under both ‘Direction 1’ and
‘Direction 2’ assume that the method of selecting the set of comparables for a given IPOI
remains intact and as described in Chapter 6. In the future, one may try developing a different
method for this purpose; that is, investigating the possibility of incorporating new DEA
models into the process. Cases in point would be the DEA models recently suggested by
Aparicio et al., 2014 [9], which pursue the objective of finding the closest efficient target for an
inefficient DMU, or the BAM model proposed by Cooper et al., 2011 [40] (see, Section 4.3.5).
Once customized to meet the specific requirements of this research, such models could
potentially improve the results significantly; hence, the author puts forth the idea of
incorporating them in Phase I as a promising research avenue to be explored.
In conclusion, this closing section of the chapter first presented additional applications
which further underscore the significance of this research. Subsequently, the focus shifted to
proposing two approaches that can be employed towards benefiting from the extra knowledge
provided by the comparables other than the closest comparable, along with the
methodological challenges concerning their development and application. It bears repeating
that this research has achieved its intent and produced accurate and satisfactory outcomes in
the absence of the suggested extensions. These additional steps would considerably increase
the mathematical and computational intricacy of the process, while the accuracy gain is
uncertain. Thus, prior to incorporating them into the existing methodology, one must weigh
the trade-off between increased complexity and superior outcomes.
REFERENCES
[1] Abdou K. and Dicle M. F., "Do Risk Factors Matter in the IPO Valuation?", Journal of Financial Regulation and Compliance, vol. 15, no. 1, pp. 63-89, 2007.
[2] Abidin S. N. Z. and Jaffar M. M., "A Review on Geometric Brownian
Motion in Forecasting the Share Prices in Bursa Malaysia", World Applied
Sciences Journal, vol. 17, pp. 87-93, 2012.
[3] Aggarwal R., "Stabilization Activities by Underwriters after Initial Public Offerings", The Journal of Finance, vol. 55, no. 3, pp. 1075-1103, 2000.

[4] Alford A. W., "The Effect of the Set of Comparable Firms on the Accuracy of the Price-Earnings Valuation Method", Journal of Accounting Research, vol. 30, no. 1, pp. 94-108, 1992.

[5] Ali A. I. and Seiford L. M., "Translation Invariance in Data Envelopment Analysis", Operations Research Letters, vol. 9, no. 6, pp. 403-405, 1990.

[6] Altiok T. and Melamed B., Simulation Modeling and Analysis with ARENA.: Academic Press, Elsevier Inc., 2007.

[7] Anadol B., "Valuing Private Companies: A DEA Approach", University of Toronto, MASc Dissertation, 2000.

[8] Andersen P. and Petersen N. C., "A Procedure for Ranking Efficient Units in Data Envelopment Analysis", Management Science, vol. 39, no. 10, pp. 1261-1264, 1993.

[9] Aparicio J., Borras F., Ortiz L., and Pastor J. T., "Benchmarking in Healthcare: An Approach Based on Closest Targets", in Managing Service Productivity: Using Frontier Efficiency Methodologies and Multicriteria Decision Making for Improving Service Performance, Emrouznejad A. and Cabanda E., Eds.: Springer, 2014, pp. 67-92.
[10] Arnold T., Fishe R. P. H., and North D., "The Effects of Ambiguous Information on Initial and Subsequent IPO Returns", Financial Management, vol. 39, no. 4, pp. 1497-1519, 2010.

[11] Arthurs J. D., Hoskisson R. E., Busenitz L. W., and Johnson R. A., "Managerial Agents Watching Other Agents: Multiple Agency Conflicts Regarding Underpricing in IPO Firms", Academy of Management Journal, vol. 51, no. 2, pp. 277-294, 2008.

[12] Asquith D., Jones J. D., and Kieschnick R., "Evidence on Price Stabilization and Underpricing in Early IPO Returns", The Journal of Finance, vol. 53, no. 5, pp. 1759-1773, 1998.

[13] Asquith P., Mikhail M. B., and Au A. S., "Information Content of Equity Analyst Reports", Journal of Financial Economics, vol. 75, no. 2, pp. 245-282, 2005.

[14] Ayayi A., "The 180-Day Lock-Up Period and Insiders’ Equity Selling", Journal of Economics and Finance, vol. 29, no. 1, pp. 32-45, 2005.

[15] Banker R. D., Charnes A., and Cooper W. W., "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis", Management Science, vol. 30, no. 9, pp. 1078-1092, 1984.

[16] Banker R. D. and Morey R. C., "Efficiency Analysis for Exogenously Fixed Inputs and Outputs", Operations Research, vol. 34, no. 4, pp. 513-521, 1986.
[17] Baron D. P., "A Model of the Demand for Investment Banking Advising and Distribution Services for New Issues", The Journal of Finance, vol. 37, no. 4, pp. 955-976, 1982.

[18] Baron D. P. and Holmström B., "The Investment Banking Contract for New Issues Under Asymmetric Information: Delegation and the Incentive Problem", The Journal of Finance, vol. 35, no. 5, pp. 1115-1138, 1980.

[19] Bartov E., Mohanram P., and Seethamraju C., "Valuation of Internet Stocks: An IPO Perspective", Journal of Accounting Research, vol. 40, no. 2, pp. 321-346, 2002.

[20] Benveniste L. M. and Spindt P. A., "How Investment Bankers Determine the Offer Price and Allocation of New Issues", Journal of Financial Economics, vol. 24, no. 2, pp. 343-361, 1989.

[21] Berger J., "The Case for Objective Bayesian Analysis", Bayesian Analysis, vol. 1, no. 3, pp. 385-402, 2006.

[22] Berger J. O. et al., "The Likelihood Principle", Lecture Notes-Monograph Series, vol. 6, 1988.

[23] Berk J. B., Green R. C., and Naik V., "Valuation and Return Dynamics of New Ventures", The Review of Financial Studies, vol. 17, no. 1, pp. 1-35, 2004.

[24] Berry D. A., Statistics: A Bayesian Perspective.: Duxbury Press, 1996.

[25] Bhojraj S. and Lee C. M. C., "Who Is My Peer? A Valuation-Based Approach to the Selection of Comparable Firms", Journal of Accounting Research, vol. 40, no. 2, pp. 407-439, 2002.
[26] Biais B. and Faugeron-Crouzet A. M., "IPO Auctions: English, Dutch,... French, and Internet", Journal of Financial Intermediation, vol. 11, no. 1, pp. 9–36, 2002.

[27] Black F. and Scholes M., "The Pricing of Options and Corporate Liabilities", Journal of Political Economy, vol. 81, no. 3, pp. 637-654, 1973.

[28] Bliss R. R. and Panigirtzoglou N., "Option-Implied Risk Aversion Estimates", The Journal of Finance, vol. 59, no. 1, pp. 407-446, 2004.

[29] Boatsman J. R. and Baskin E. F., "Asset Valuation with Incomplete Markets", The Accounting Review, vol. 56, no. 1, pp. 38-53, 1981.

[30] Bogetoft P. and Otto L., Benchmarking with DEA, SFA, and R.: Springer Science+Business Media, 2011.

[31] Brau J. C. and Fawcett S. E., "Initial Public Offerings: An Analysis of Theory and Practice", The Journal of Finance, vol. 61, no. 1, pp. 399-436, 2006.

[32] Brav A. and Lehavy R., "An Empirical Analysis of Analysts' Target Prices: Short-term Informativeness and Long-term Dynamics", The Journal of Finance, vol. 58, no. 5, pp. 1933-1968, 2003.

[33] Certo S. T., Holcomb T. R., and Holmes Jr. R. M., "IPO Research in Management and Entrepreneurship: Moving the Agenda Forward", Journal of Management, vol. 35, no. 6, pp. 1340–1378, 2009.

[34] Chambers R. G., Chung Y., and Färe R., "Benefit and Distance Functions", Journal of Economic Theory, vol. 70, no. 2, pp. 407-419, 1996.

[35] Chambers R. G., Chung Y., and Färe R., "Profit, Directional Distance Functions, and Nerlovian Efficiency", Journal of Optimization Theory and Applications, vol. 98, no. 2, pp. 351-364, 1998.
[36] Charnes A., Cooper W. W., and Rhodes E., "Measuring the Efficiency of Decision Making Units", European Journal of Operational Research, vol. 2, no. 6, pp. 429-444, 1978.

[37] Charnes A., Cooper W. W., Rousseau J., and Semple J., "Data Envelopment Analysis and Axiomatic Notions of Efficiency and Reference Sets", Center for Cybernetic Studies, University of Texas, CCS Research Report 558, 1987.

[38] Cheng G., Zervopoulos P., and Qian Z., "A Variant of Radial Measure Capable of Dealing with Negative Inputs and Outputs in Data Envelopment Analysis", European Journal of Operational Research, vol. 225, no. 1, pp. 100-105, 2013.

[39] Chung Y. H., Färe R., and Grosskopf S., "Productivity and Undesirable Outputs: A Directional Distance Function Approach", Journal of Environmental Management, vol. 51, no. 3, pp. 229-240, 1997.

[40] Cooper W. W., Pastor J. T., Borras F., Aparicio J., and Pastor D., "BAM: A Bounded Adjusted Measure of Efficiency for Use with Bounded Additive Models", Journal of Productivity Analysis, vol. 35, no. 2, pp. 85-94, 2011.

[41] Cooper W. W., Seiford L. M., and Tone K., Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software.: Springer Science+Business Media, 2007.

[42] Cornelli F. and Goldreich D., "Bookbuilding and Strategic Allocation", The Journal of Finance, vol. 56, no. 6, pp. 2337–2369, 2001.

[43] Cornelli F. and Goldreich D., "Bookbuilding: How Informative Is the Order Book?", The Journal of Finance, vol. 58, no. 4, pp. 1415–1443, 2003.
[44] Damien P., Dellaportas P., Polson N. G., and Stephens D. M., Bayesian Theory and Applications.: Oxford University Press, 2013.

[45] De Sousa M. and Stošić B., "Technical Efficiency of the Brazilian Municipalities: Correcting Nonparametric Frontier Measurements for Outliers", Journal of Productivity Analysis, vol. 24, no. 2, pp. 157-181, 2005.

[46] DeAngelo L. E., "Equity Valuation and Corporate Control", The Accounting Review, vol. 65, no. 1, pp. 93-112, 1990.

[47] Dowd K., Beyond Value at Risk: The New Science of Risk Management.: John Wiley & Sons Ltd., 1999.

[48] Draho J., The IPO Decision: Why and How Companies Go Public.: Edward Elgar Publishing Limited, 2004.

[49] Emrouznejad A., Anouze A. L., and Thanassoulis E., "A Semi-Oriented Radial Measure for Measuring the Efficiency of Decision Making Units with Negative Data, Using DEA", European Journal of Operational Research, vol. 200, no. 1, pp. 297-304, 2010.

[50] Fama E. F. and French K. R., "Industry Costs of Equity", Journal of Financial Economics, vol. 43, no. 2, pp. 153-193, 1997.

[51] Farrell M. J., "The Measurement of Productive Efficiency", Journal of the Royal Statistical Society, Series A (General), vol. 120, no. 3, pp. 253-290, 1957.

[52] Farzipoor Saen R., "Developing a Nondiscretionary Model of Slacks-Based Measure in Data Envelopment Analysis", Applied Mathematics and Computation, vol. 169, no. 2, pp. 1440–1447, 2005.

[53] Federal Reserve Bank of St. Louis (2013). [Online]. http://research.stlouisfed.org/fred2/series/GDPC1/downloaddata?cid=106
[54] Feldman M. S. and Pentland B. T., "Reconceptualizing Organizational
Routines as a Source of Flexibility and Change", Administrative Science
Quarterly, vol. 48, no. 1, pp. 94-118, 2003.
[55] Ferson S. , "Bayesian Methods in Risk Assessment", Dominique
GUYONNET, Head of Waste & Storage Unit, Working Paper 2004.
[56] Garfinkel J. A., "IPO Underpricing, Insider Selling and Subsequent Equity
Offerings: Is Underpricing a Signal of Quality?", Financial Management, vol.
22, no. 1, pp. 74-83, 1993.
[57] Geddes R. , IPOs and Equity Offerings. Oxford: Butterworth-Heinemann,
2003.
[58] Gibbons J. D. and Chakraborti S. , Nonparametric Statistical Inference.:
Taylor & Francis e-Library, 2005.
[59] Gilson S. C., Hotchkiss E. S., and Ruback R. S., "Valuation of Bankrupt
Firms", Review of Financial Studies, vol. 13, no. 1, pp. 43-74, 2000.
[60] Giordano L. and Siciliano G. , "Real-World and Risk-Neutral Probabilities
in the Regulation on the Transparency of Structured Products", Commissione
Nazionale per le Società e la Borsa (CONSOB), Working Paper 2013.
[61] Golany B. and Roll Y. , "Some Extensions of Techniques to Handle Non-
Discretionary Factors in Data Envelopment Analysis", The Journal of
Productivity Analysis, vol. 4, no. 4, pp. 419-432, 1993.
[62] Group RiskMetrics , "Risk-Metrics Technical Document", J.P. Morgan,
1986.
244
[63] Hadavandi E. , Shavandi H. , and Ghanbari A. , "Integration of Genetic
Fuzzy Systems and Artificial Neural Networks for Stock Price Forecasting",
Knowledge-Based Systems, vol. 23, no. 8, pp. 800-808, 2010.
[64] Hall J. , "The Impact of Growth, Volatility and Competitive Advantage on
the Value of Equity Investments and Their Embedded Options", University of
Queensland, PhD Thesis 2005.
[65] Hand J. R.M., "Profits, Losses and the Non-Linear Pricing of Internet
Stocks", Working Paper 2000. [Online]. SSRN: http://ssrn.com/abstract=204875 or
http://dx.doi.org/10.2139/ssrn.204875
[66] Hand J. R.M., "The Market Valuation of Biotechnology Firms and
Biotechnology R&D", Working Paper 2001. [Online]. SSRN:
http://ssrn.com/abstract=294471 or http://dx.doi.org/10.2139/ssrn.294471
[67] Hardle W. K. and Simar L. , Applied Multivariate Statistical Analysis, 3rd
ed.: Springer, 2012.
[68] Holton G. A., "Closed Form Value at Risk", Contingency Analysis, pp. 1-10,
1996.
[69] Holton G. , "Subjective Value at Risk", Risks and Rewards Newsletter, no.
31, pp. 14-16, 1998.
[70] Houston J. , James C. , and Karceski J. , "What a Difference a Month
Makes: Stock Analyst Valuations Following Initial Public Offerings", Journal
of Financial and Quantitative Analysis, vol. 41, no. 1, pp. 111-137, 2006.
[71] Howson C. and Urbach P. , Scientific Reasoning: The Bayesian Approach,
2nd ed.: Open Court Publishing Company, 1996.
[72] Hsu Y. T., Liu M. C., Yeh J. , and Hung H. F., "Forecasting the Turning
245
Time of Stock Market Based on Markov–Fourier Grey Model", Expert Systems
With Applications, vol. 36, no. 4, pp. 8597-8603, 2009.
[73] Huck S. W., Statistical Misconceptions.: Taylor & Francis Group, LLC,
2009.
[74] Hughes P. J. and Thakor A. V., "Litigation Risk, Intermediation, and the
Underpricing of Initial Public Offerings", Review of Financial Studies, vol. 5,
no. 4, pp. 709-742, 1992.
[75] Hull J. C., Risk Management and Financial Institutions, 3rd ed.: John Wiley
& Sons, Inc., 2012.
[76] Ibbotson R. G., "Price Performance of Common Stock New Issues", Journal
of Financial Economics, vol. 2, no. 3, pp. 235-272, 1975.
[77] Ibbotson R. G. and Jaffe J. F., "'Hot Issue' Markets", The Journal of
Finance, vol. 30, no. 4, pp. 1027-1042, 1975.
[78] Ibbotson R. G. and Ritter J. R., "Initial Public Offerings", in North-Holland
Handbooks of Operations Research and Management Science, Jarrow R. A.,
Maksimovic V. A., and Ziemba W. T., Eds., 1995, pp. 993-1016.
[79] Jaynes E. T. and Bretthorst G. L., Probability Theory: The Logic of
Science.: Cambridge University Press, 2003.
[80] Jorion P. , Value at Risk: The New Benchmark for Managing Financial Risk,
3rd ed.: McGraw-Hill Companies, Inc., 2006.
[81] Jorion P. and Talmor C. , "Value Relevance of Financial and Non Financial
Information in Emerging Industries: The Changing Role of Web Traffic Data",
Working Paper 2001.
246
[82] Kaplan S. N. and Ruback R. S., "The Valuation of Cash Flow Forecasts: An
Empirical Analysis", The Journal of Finance, vol. 50, no. 4, pp. 1059-1093,
1995.
[83] Kay S. , Fundamentals of Statistical Signal Processing, Volume I:
Estimation Theory.: Prentice Hall, 1993, vol. 1.
[84] Kerstens K. and De Woestyne V. , "Negative Data in DEA: A Simple
Proportional Distance Function Approach", The Journal of the Operational
Research Society, vol. 62, no. 7, pp. 1413-1419, 2011.
[85] Khindanova I. N. and Rachev S. T., "Value at risk: Recent advances", in
Handbook on Analytic-Computational Methods in Applied Mathematics.:
University of California, Santa Barbara, Department of Economics, 2000.
[86] Khindanova I. N., Rachev S. , and Schwartz E. , "Stable Modeling of Value
at Risk", Mathematical and Computer Modelling, vol. 34, no. 9-11, pp. 1223-
1259, 2001.
[87] Kim M. and Ritter J. R., "Valuing IPOs", Journal of Financial Economics,
vol. 53, no. 3, pp. 409-437, 1999.
[88] Koopmans T. C., "Analysis of Production as an Efficient Combination of
Activities", in Activity Analysis of Production and Allocation., 1951, pp. 33-97.
[Online]. http://www.policonomics.com/wp-content/uploads/Analysis-of-Production-
as-an-Efficient-Combination-of-Activities.pdf
[89] Lee C. , "Underwriter Reputation and the Decision to Go Public", Journal of
Finance and Accountancy, vol. 6, pp. 108-128, 2011.
[90] Ljungqvista A. P. and Wilhelm Jr. W. J., "IPO Allocations: Discriminatory
or Discretionary?", Journal of Financial Economics, vol. 65, no. 2, pp. 167–
247
201, 2002.
[91] Ljungqvist A. P., Jenkinson T. , and Wilhelm Jr. W. J., "Global Integration
in Primary Equity Markets: The Role of U.S. Banks and U.S. Investors", The
Review of Financial Studies, vol. 16, no. 1, pp. 63-99 , 2003.
[92] Ljungqvist A. , Nanda V , and Singh R. , "Hot Markets, Investor Sentiment,
and IPO Pricing", The Journal of Business, vol. 79, no. 4, pp. 1667-1702, 2006.
[93] Loughran T. and Ritter J. R., "The New Issues Puzzle", The Journal of
Finance, vol. 50, no. 1, pp. 23-51, 1995.
[94] Lovell C.A. K., "Measuring the Macroeconomic Performance of the
Taiwanese Economy", International Journal of Production Economics, vol. 39,
no. 1-2, pp. 165–178, 1995.
[95] Lovell C.A. K. and Pastor J. T., "Units Invariant and Translation Invariant
DEA Models", Operations Research Letters, vol. 18, no. 3, pp. 147–151, 1995.
[96] Malakoff D. , "Bayes Offers a 'New' Way to Make Sense of Numbers",
Science, vol. 286, no. 5444, pp. 1460-1464 , 1999.
[97] Maya C. , "In Search of the True Value of a Start-Up Firm: Creative
Destruction and Real Options Approach", in Proceedings of the 8th Annual
International Conference on Real Options, 2004.
[98] Merton R. C., "Theory of Rational Option Pricing", The Bell Journal of
Economics and Management Science , vol. 4, no. 1, pp. 141-183, 1973.
[99] Mikkelson W. H., Partch, M. M. and Shah K. , "Ownership and Operating
Performance of Companies That Go Public", Financial Economics, vol. 44, no.
3, pp. 281-307, 1997.
248
[100] Muñiza M. , Paradi J. , Ruggiero J. , and Yang Z. , "Evaluating Alternative
DEA Models Used to Control for Non-Discretionary Inputs", Computers &
Operations Research, vol. 33, no. 5, pp. 1173–1183, 2006.
[101] Myers S. C., "Finance Theory and Financial Strategy", Strategic
Management, vol. 14, no. 1, pp. 126-137, 1984.
[102] O'Hagan A. , "The Bayesian Approach to Statistics", in Handbook of
Probability: Theory and Applications.: Sage Publications, Inc., 2008.
[103] Ohlson J. A., "Eamings, Book Values, and Dividends in Equity Valuation",
Contemporary Accounting Research, vol. 11, no. 2, pp. 661–687, 1995.
[104] Pagano M. , Panetta F. , and Zingales L. , "Why Do Companies Go Public?
An Empirical Analysis", The Journal of Finance, vol. 53, no. 1, pp. 27-64,
1998.
[105] Palepu K. G. and Healy P. M., Business Analysis and Valuation: Using
Financial Statements, 4th ed.: Thomson South-Western, 2008.
[106] Panik M. J., Advanced Statistics from an Elementary Point of View.:
Elsevier Academic Press, 2005.
[107] Panneerselvam R. , Research Methodology.: Prentice-Hall, 2004.
[108] Pastor J. T., "Translation Invariance in Data Envelopment Analysis: A
Generalization", Annals of Operations Research, vol. 66, pp. 93-102, 1996.
[109] Pastor J. T. and Aparicio J. , "Translation Invariance in Data Envelopment
Analysis", in Data Envelopment Analysis: A Handbook of Models and
Methods, Zhu J. , Ed.: Springer Verlag, 2015.
[110] Pastor J. T. and Ruiz J. L., "Variables With Negative Values In DEA", in
249
Modeling Data Irregularities and Structural Complexities in Data
Envelopment Analysis, Zhu J. and Cook W. D., Eds.: Springer, 2007, ch. 4, pp.
63-84.
[111] Pastor J. T., Ruiz J. L., and Sirvent I. , "An Enhanced DEA Russell Graph
Efficiency Measure", European Journal of Operational Research, vol. 115, no.
3, pp. 596-607, 1999.
[112] Pinedo M. and Walter I. , Eds., Global Asset Management: Strategies, Risks,
Processes, and Technologies.: SimCorp StrategyLab, 2013.
[113] Prem K. P. et al., "Risk Measures Constituting a Risk Metrics which
Enables Improved Decision Making: Value-at-Risk", Journal of Loss
Prevention in the Process Industries, vol. 23, no. 2, pp. 211-219, 2010.
[114] Purnanandam A. K. and Swaminathan B. , "Are IPOs Really Underpriced?",
The Review of Financial Studies, vol. 17, no. 3, pp. 811-848, 2004.
[115] Rachev S. T., Hsu S. J. J., Bagasheva B. S., and Fabozzi F. J., Bayesian
Methods in Finance.: John Wiley & Sons, Inc., 2008.
[116] Ray S. C., "Data Envelopment Analysis, Nondiscretionary Inputs and
Efficiency: An Alternative Interpretation", Socio-Economic Planning Sciences,
vol. 22, no. 4, pp. 167-176, 1988.
[117] (2013) Renaissance Capital: The IPO Expert. [Online]. http://www.renaissancecapital.com/RenCap/AboutUs/WhyIPOs.aspx
[118] Ritter J. R., "The 'Hot Issue' Market of 1980", The Journal of Business, vol.
57, no. 2, pp. 215-240, 1984.
[119] Ritter J. R., "The Long-Run Performance of Initial Public Offerings", The
Journal of Finance, vol. 46, no. 1, pp. 3-27, 1991.
250
[120] Roosenboom P. , "How Do Underwriters Value Initial Public Offerings? An
Empirical Analysis of the French IPO Market", Contemporary Accounting
Research, vol. 24, no. 4, pp. 1217–1243, 2007.
[121] Roosenboom P. , "Valuing and Pricing IPOs", Journal of Banking and
Finance, vol. 36, no. 6, pp. 1653-1664, 2012.
[122] Ross S. A., Westerfield R. W., Jordan B. D., and Roberts G. S.,
Fundamentals of Corporate Finance.: McGraw-Hill Ryerson, 2010.
[123] Ruggiero J. , "Non-Discretionary Inputs in Data Envelopment Analysis",
European Journal of Operational Research, vol. 111, no. 3, pp. 461–469,
1998.
[124] Ruggiero J. , "On the Measurement of Technical Efficiency in the Public
Sector", European Journal of Operational Research, vol. 90, no. 3, pp. 553–
565, 1996.
[125] Ruud J. S., "Underwriter Price Support and the IPO Underpricing Puzzle",
Journal of Financial Economics, vol. 34, no. 2, pp. 135-151, 1993.
[126] Scheel H. , "Undesirable Outputs in Efficiency Valuations", European
Journal of Operational Research, vol. 132, no. 2, pp. 400-410, 2001.
[127] Schwartz E. S. and Moon M. , "Rational Pricing of Internet Companies",
Financial Analysts Journal, vol. 56, no. 3, pp. 62-75, 2000.
[128] Schwartz E. S. and Moon M. , "Rational Pricing of Internet Companies
Revisited", The Financial Review, vol. 36, no. 4, pp. 7-26, 2001.
[129] Seidl I. and Sommersguter-Reichmann M. , "Visualizing Production
Surfaces in 3D Diagrams", Advances in Operations Research, vol. 2011, pp. 1-
17, 2011.
251
[130] Seiford L. M. and Zhu J. , "Modeling Undesirable Factors in Efficiency
Evaluation", European Journal of Operational Research, vol. 142, no. 1, pp.
16–20, 2002.
[131] Sharp J. A., Meng W. , and Liu W. , "A Modified Slacks-Based Measure
Model for Data Envelopment Analysis with 'Natural' Negative Outputs and
Inputs", The Journal of the Operational Research Society, vol. 58, no. 12, pp.
1672-1677, 2007.
[132] Silva Portela M. C. A. , Thanassoulis E. , and Sim G. , "Negative Data in
DEA: A Directional Distance Approach Applied to Bank Branches", The
Journal of the Operational Research Society, vol. 55, no. 10, pp. 1111-1121,
2004.
[133] Simak P. C., "Inverse and Negative DEA and their Application to Credit
Risk Evaluation", University of Toronto, PhD Dissertation 2000.
[134] Siu T. K., Tong H. , and Yang H. , "On Bayesian Value at Risk: From
Linear to Non-Linear Portfolios", Asia-Pacific Financial Markets, vol. 11, pp.
161-184, 2004.
[135] Teoh S. H., Wong T. J., and Rao G. R., "Are Accruals during Initial Public
Offerings Opportunistic?", Review of Accounting Studies, vol. 3, no. 1-2, pp.
175–208, 1998.
[136] Tone K. , "A Slacks-Based Measure of Efficiency in Data Envelopment
Analysis", European Journal of Operational Research, vol. 130, no. 3, pp.
498-509, 2001.
[137] Tone K. , "A Slacks-Based Measure of Super-Efficiency in Data
Envelopment Analysis", European Journal of Operational Research, vol. 143,
no. 1, pp. 32-41, 2002.
252
[138] Vincent-Humphreys R. and Noss J. , "Estimating Probability Distributions
of Future Asset Prices: Empirical Transformations from Option-Implied Risk-
Neutral to Real-World Density Functions", Working Paper No. 455 2012.
[139] Wu L. C., Ong C. S., and Hsu Y.W. , "Knowledge-Based Organization
Evaluation", Decision Support Systems, vol. 45, no. 3, pp. 541-549, 2008.
[140] Yang Z. and Paradi J. C., "Cross Firm Bank Branch Benchmarking Using
'Handicapped' Data Envelopment Analysis to Adjust for Corporate Strategic
Effects", in Proceedings of the 39th Hawaii International Conference on
System Sciences, 2006. [Online]. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=01579362
[141] Zhu J. , "Multi-Factor Performance Measure Model with an Application to
Fortune 500 Companies", European Journal of Operational Research, vol.
123, no. 1, pp. 105-124, 2000.
253
Appendix A. COMPLEMENTARY DETAILS ON THE
CALIBRATION OF THE DISTANCE FUNCTION
A.1 Introduction
Section 8.2.3 detailed the calibration process developed to find the maximum likelihood estimate of the parameter α which resides in the distance function (see, Equation (6.37)). This Appendix summarizes the outcomes of the side analyses performed to further elucidate the mechanism underlying Figure 8-7 and Figure 8-10. First, Section A.2 reports the results of the two tests of ‘uniform cumulative probabilities’ (see, Section 8.2.4.1) and ‘VaR break frequency’ (see, Section 8.2.4.2), conducted on the basis of all the 276 IPOIs, for the cases where α is set to values other than 1.5. Second, Section A.3 focuses on a small cluster depicted in Figure 8-10, which appears to be an anomaly. For any subset that falls under this cluster, the corresponding calibration outputs 0.5 as the maximum likelihood estimate of α. Section A.3 provides some visual aid to demonstrate the features of the log-likelihood functions associated with this cluster.
A.2 Supplementary Discussion on Section 8.2.4: Scrutiny of Estimated PDF90IPOI’s under Other Values of α
The calibration process, developed in Section 8.2.3, reported 1.5 as the maximum likelihood estimate of α when all the 276 IPOIs participate in the analysis. Further analysis indicated that 1.5 can be presented as a property of the distance function and may be utilized to skip the computationally intensive step of calibration; that is, the risk assessment of a given IPO can be performed by assuming α = 1.5, a priori.

Subsequently, Section 8.2.4 evaluated the standalone accuracy of the set of 276 PDF90IPOI’s produced under α = 1.5. In this section, the same process is iterated using the following values
of α: [0, 0.5, 1.25, 1.9, ≥ 3]. It has already been established in Section 8.2.3 that the PDF90IPOI’s generated using α = 1.5 are typically more accurate than those obtained using the above-indicated α values. Here, however, the goal is to investigate whether the individual sets of PDF90IPOI’s associated with these α values would pass the tests of ‘uniform cumulative probabilities’ and ‘VaR break frequency’.
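Both tests start from the same quantity: the cumulative probability value of the realized day-90 price under the estimated PDF90IPOI. The following sketch shows one way to compute a CPV from a density discretized on a price grid; the trapezoid rule and the uniform toy density are illustrative assumptions of this example, not the discretization actually used in the thesis.

```python
def cpv(prices, density, realized):
    """Cumulative probability value (CPV) of a realized price under a
    discretized density given as parallel lists (price grid, density
    values): the trapezoid-rule integral of the density up to `realized`."""
    total = 0.0
    for i in range(1, len(prices)):
        lo, hi = prices[i - 1], prices[i]
        if realized <= lo:
            break
        right = min(hi, realized)          # clip the last panel at the realized price
        # linearly interpolate the density at the clipped right edge
        f_right = density[i - 1] + (density[i] - density[i - 1]) * (right - lo) / (hi - lo)
        total += 0.5 * (density[i - 1] + f_right) * (right - lo)
    return total

# Toy check with a uniform density on [0, 100]: the CPV of a realized
# price p is simply p / 100.
grid = [float(p) for p in range(101)]
dens = [0.01] * 101
```

A set of 276 such CPVs, one per IPOI, is then the input to both the test of uniform CPVs and the test of VaR break frequency.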
Figure A-1 to Figure A-5, which are presented at the end of this subsection, visualize the results of the test of uniform cumulative probability values (CPVs) for the individual α values. The corresponding outcomes of the K-S test are displayed in Table A-1. Recall that for a given α, the K-S test is conducted under the null hypothesis that ‘the observed set of 276 CPVs has been drawn from a reference population with uniform density between 0 and 1’ (see, Section 8.2.4.1.1).
Table A-1. The table reports the individual K-S p-values computed under different assumptions of α. For a given α, the null hypothesis can be stated as ‘the observed set of 276 CPVs has been drawn from a reference population with uniform density between 0 and 1’. Results indicate that in the cases of 0, 0.5, 1.9, and ≥ 3, the null hypothesis is rejected: the respective observed sample set would be considered a rare event if the null hypothesis were valid. It can, therefore, be concluded that the individual sets of PDF90IPOI’s produced using these α values have failed to fulfil the objectives of the test of uniform CPVs, which implies that no convincing evidence exists to support the standalone accuracy of each of these sets.

α                          0        0.5      1.25     1.5      1.9      ≥ 3
p-value of the K-S test    0.0195   0.0269   0.3048   0.8893   0.0082   0.0018
Through visual inspection of the shapes of the histograms and drawing upon the p-values reported in Table A-1, it is concluded that the null hypothesis is rejected for the individual sets of PDF90IPOI’s obtained using the following values of α: [0, 0.5, 1.9, ≥ 3]. The null hypothesis cannot, however, be rejected for the set of PDF90IPOI’s generated under the assumption of α = 1.25. Sorting the values of α in ascending order of their p-values gives [≥ 3, 1.9, 0, 0.5, 1.25]; only the last member of the set (i.e., α = 1.25) exhibits a p-value that provides sufficient confidence in the accuracy of the corresponding set of PDF90IPOI’s. Yet, the p-value reported in the case of α = 1.25 (i.e., 0.3048) still falls short of the p-value obtained for α = 1.5 (i.e., 0.8893).
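For reference, the uniformity check itself can be reproduced in a few lines. The sketch below implements the one-sample K-S statistic against the Uniform(0, 1) reference together with Kolmogorov’s asymptotic p-value series; both the asymptotic approximation and the two synthetic 276-point samples are assumptions of this example, so the p-values will not exactly match those of a statistical package for small samples.

```python
import math

def ks_uniform(cpvs):
    """One-sample Kolmogorov-Smirnov test of `cpvs` against Uniform(0, 1).
    Returns (D, p): the K-S statistic and the asymptotic two-sided
    p-value from Kolmogorov's alternating series."""
    x = sorted(cpvs)
    n = len(x)
    # Under Uniform(0, 1) the reference CDF at x is x itself, so D is the
    # largest gap between x and the empirical CDF steps around it.
    d = max(max((i + 1) / n - xi, xi - i / n) for i, xi in enumerate(x))
    t = math.sqrt(n) * d
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * t) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# A near-uniform set of 276 CPVs passes comfortably ...
d_u, p_u = ks_uniform([i / 277 for i in range(1, 277)])
# ... whereas CPVs piled up in the lower tail are firmly rejected.
d_s, p_s = ks_uniform([(i / 277) ** 3 for i in range(1, 277)])
```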
The results of the test of VaR break frequency are reported in Table A-2.

Table A-2. Each cell presents the realized VaR break frequency under the corresponding α and nominal frequency of VaR breaks. The exhibited frequencies can be interpreted in a similar fashion to those displayed in Table 8-1 (see, Section 8.2.4.2).

Nominal Frequency    Realized Frequency of VaR Breaks under α
of VaR Breaks        0        0.5      1.25     1.5      1.9      ≥ 3
0.05                 0.0290   0.0326   0.0399   0.0580   0.0761   0.0978
0.1                  0.0580   0.0616   0.0906   0.1087   0.1377   0.1594
0.15                 0.0978   0.1014   0.1341   0.1486   0.2138   0.2391
0.2                  0.1341   0.1377   0.1848   0.1993   0.2609   0.2826

As presented, at all four confidence levels (95%, 90%, 85%, and 80%), the risk is ‘overestimated’ in the cases of α = 0 and 0.5 and, to a lesser degree, under α = 1.25. For the higher values of α (i.e., 1.9 and ≥ 3), the realized VaR break frequencies are indicative of risk ‘underestimation’.
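The break-counting logic behind Table A-2 reduces to a tail count once the CPVs are in hand: a VaR break at nominal frequency q occurs exactly when the realized day-90 price falls in the lower q-tail of the estimated PDF90IPOI, i.e., when its CPV is below q. The sketch below illustrates this with synthetic CPV sets; the ‘compressed’ set standing in for risk overestimation is an illustrative construction, not data from the thesis.

```python
def var_break_frequencies(cpvs, nominal=(0.05, 0.10, 0.15, 0.20)):
    """Realized VaR break frequency at each nominal level: the share of
    IPOIs whose realized day-90 price fell in the lower q-tail of its
    estimated density (CPV < q).  Well-calibrated PDFs reproduce the
    nominal levels; too few breaks signal risk overestimation."""
    n = len(cpvs)
    return {q: sum(1 for c in cpvs if c < q) / n for q in nominal}

# Perfectly calibrated CPVs (evenly spread over (0, 1)) ...
calibrated = [(i + 0.5) / 276 for i in range(276)]
# ... versus CPVs compressed toward 0.5, as an overestimating model
# would produce (realizations rarely reach the fattened tails).
compressed = [0.25 + 0.5 * c for c in calibrated]

freq_cal = var_break_frequencies(calibrated)
freq_over = var_break_frequencies(compressed)
```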
The implications of the outcomes of both tests can be recapitulated as follows: First, using the set of all IPOIs, the highest accuracy was achieved under the α estimate equal to 1.5. As the value of α deviates from 1.5, the standalone accuracy of the corresponding set of estimated PDF90IPOI’s deteriorates. In Section 8.2.3, some possible underlying mechanisms leading to such results are discussed at length. Second, it can straightforwardly be verified that α has the capacity to influence the standalone accuracy of estimated PDF90IPOI’s, which attests to the efficacy of the second term of the distance function (i.e., the ψ dimension) in selecting the closest comparable.
Figure A-1. The histogram is built using 276 CPVs which are computed on the basis of the respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 0.

Figure A-2. The histogram is built using 276 CPVs which are computed on the basis of the respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 0.5.
Figure A-3. The histogram is built using 276 CPVs which are computed on the basis of the respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 1.25.

Figure A-4. The histogram is built using 276 CPVs which are computed on the basis of the respective set of 276 PDF90IPOI’s, estimated under the assumption of α = 1.9.
Figure A-5. The histogram is built using 276 CPVs which are computed on the basis of the respective set of 276 PDF90IPOI’s, estimated under the assumption of α ≥ 3. Recall from Section 8.2.3 that starting at the vicinity of α = 3, the log-likelihood function stabilizes, which indicates that the second term (i.e., the α term) has dominated the first term of the similarity metric in all the 276 IPOI cases. Therefore, the individual sets of PDF90IPOI’s estimated from this point onward remain unchanged, which consequently translates into invariant sets of CPVs, and hence, identical CPV histograms.
A.3 Behaviour of Log-Likelihood Functions of α
As indicated earlier, the focal intent of this section is to investigate the structure and characteristics of the log-likelihood functions underlying a cluster of 9 consecutive points depicted in Figure 8-10. The common feature of the points within this cluster is their comparatively low maximum likelihood estimate of α (i.e., α = 0.5), which called for further scrutiny; the first subset of the IPOs which exhibits this feature comprises the 52nd to 141st IPOs. The behaviour of the log-likelihood functions associated with the points within this cluster can be categorized into three different types:
First. Figure A-6 visualizes an example of the first type of behaviour. It depicts the log-likelihood function obtained on the basis of the subset consisting of the 52nd to 141st IPOs. Similar shapes were observed for the succeeding two subsets (i.e., 52nd IPO to 142nd IPO and 52nd IPO to 143rd IPO). As illustrated in Figure A-6, α = 0.5 is the ‘argument’ (arg) of a local maximum which has overtaken 1.8 as the global arg maximum, with only a marginal difference. Note that in Figure 8-10, the maximum likelihood estimate of α is reported as 1.8 immediately before plunging to the level of 0.5. Although no local maximum occurs at 1.5, its likelihood is situated very close to those of the local maxima; in Figure A-6, the difference between the log-likelihood values at α = 0.5 and α = 1.5 is equal to 0.03, which translates into a likelihood ratio of 1.07 on the linear scale.
Second. The subsets spanning the 52nd to 144th IPOs and the 52nd to 145th IPOs fall under this class. The log-likelihood function of the former set is presented in Figure A-7. Similar to the first type, α = 0.5 is both a local and the global maximum. Here, however, the second local maximum occurs at α = 1.5, whose likelihood value continues to remain close to that of the global maximum.
Third. The rest of the subsets within this cluster belong to the third category. As a case in point, consider Figure A-8, which depicts the log-likelihood function of the subset comprising the 52nd to 146th IPOs. As demonstrated, α = 0.5 and α = 1.5 share the position of the maximum likelihood estimate of α.
To further elaborate on the microstructure of Figure 8-10, refer to Figure A-9. This figure presents the log-likelihood function of the subset consisting of the 52nd to 150th IPOs. For this subset, the maximum likelihood estimate of α has reverted to 1.8. Although α = 0.5 may still be considered the argument of a local maximum, it is associated with a significantly lower log-likelihood value compared to the new global maximum. Likewise, the gap between the likelihood values at α = 0.5 and α = 1.5 has widened, and here, 1.5 shows a noticeably higher likelihood value. Note that the deviation of the likelihood of α = 1.5 from the global maximum continues to remain small.
In short, the core of all the aforementioned details can be stated as follows: when 0.5 is selected as the maximum likelihood estimate of α, there exists another local maximum which either rests at the same level as the likelihood of α = 0.5 or is surpassed by it by only a slight margin. Therefore, although it first appeared to be an anomaly, the maximum likelihood estimate of 0.5 is indeed consistent with the rest of the results exhibited by Figure 8-10. Considering these observations in conjunction with the trend of the maximum likelihood estimate of α demonstrated in Figure 8-10, it can be concluded that the maximum at α = 0.5 is unstable and will be dominated by the one at 1.5 once the subset size rises beyond a certain threshold.
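The competition between the two maxima described above is easy to reproduce with a generic grid search. In the sketch below, the log-likelihood is a synthetic bimodal toy function chosen to mimic Figure A-6 (a narrow peak at α = 0.5 barely edging out a broad one near 1.6); the thesis’s actual likelihood function is not reproduced here.

```python
import math

def local_maxima(values):
    """Indices i where values[i] is a strict interior local maximum."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]

def grid_mle(loglik, grid):
    """Grid-search maximum likelihood estimate of alpha, returned together
    with all interior local maxima so that near-ties (such as the 0.5
    versus 1.5 competition) can be flagged rather than silently resolved."""
    values = [loglik(a) for a in grid]
    i_star = max(range(len(values)), key=values.__getitem__)
    return grid[i_star], [grid[i] for i in local_maxima(values)]

# Synthetic bimodal log-likelihood: a narrow peak at 0.5 slightly above
# a broad one centred at 1.6 (both heights are arbitrary choices).
def toy_loglik(a):
    return (2.00 * math.exp(-8.0 * (a - 0.5) ** 2)
            + 1.90 * math.exp(-2.0 * (a - 1.6) ** 2))

grid = [k / 10 for k in range(31)]        # alpha in [0.0, 3.0], step 0.1
alpha_hat, peaks = grid_mle(toy_loglik, grid)
```

On this toy profile the global arg max is 0.5 while 1.6 survives as a second local maximum; nudging the two peak heights closer together reproduces the ties and rank flips observed across Figures A-6 to A-9.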
Figure A-6. The graph demonstrates the log-likelihood function of α computed on the basis of the subset consisting of the 52nd to 141st IPOs. This figure serves as an example of the ‘first type’ of behaviour described on Page 260. Similar behaviours have been recorded for the two succeeding subsets (i.e., 52 to 142 and 52 to 143).

Figure A-7. The graph demonstrates the log-likelihood function of α computed on the basis of the subset consisting of the 52nd to 144th IPOs. This figure serves as an example of the ‘second type’ of behaviour described on Page 260. A similar behaviour has been observed for the set of the 52nd to 145th IPOs.
Figure A-8. The graph demonstrates the log-likelihood function of α computed on the basis of the subset consisting of the 52nd to 146th IPOs. This figure serves as an example of the ‘third type’ of behaviour described on Page 260. Similar behaviours have been recorded for the three succeeding subsets (i.e., 52 to 147, 52 to 148, and 52 to 149).

Figure A-9. The graph exhibits the log-likelihood function of α computed on the basis of the subset consisting of the 52nd to 150th IPOs. More details regarding this plot are provided on Page 261.