41
Non-parametric calibration of the local volatility surface for European options Nov 8th, 2011 Jian Geng Department of Mathematics, Florida State University, Room 208, 1017 Academic Way, Tallahassee, FL, 32306-4510, USA. Email: [email protected] I. Michael Navon Department of Scientific Computing, Florida State University, 400 Dirac Science Library, Tallahassee, FL, 32306-4120, USA. Email: [email protected] Xiao Chen Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA 94551, USA. Email: [email protected] 1

Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

Non-parametric calibration of the local volatilitysurface for European options

Nov 8th, 2011

Jian GengDepartment of Mathematics, Florida State University, Room 208, 1017 Academic Way,Tallahassee, FL, 32306-4510, USA. Email: [email protected]

I. Michael NavonDepartment of Scientific Computing, Florida State University, 400 Dirac ScienceLibrary, Tallahassee, FL, 32306-4120, USA. Email: [email protected]

Xiao ChenCenter for Applied Scientific Computing, Lawrence Livermore National Laboratory,Livermore, CA 94551, USA. Email: [email protected]

1

Page 2: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

Abstract

In this paper, we explore a robust method for calibration of the local volatil-ity surface for European options. Assuming the the volatility surface is smooth,we apply a second order Tikhonov regularization to the calibration problem. Ad-ditionally we propose a new approach for choosing the Tikhonov regularizationparameter. Using the TAPENADE automatic differentiation tool in order to ob-tain adjoint code of the direct model is employed as an efficient way to obtain thegradient of cost function with respect to the local volatility surface. Finally weperform four numerical tests aimed at assessing and verifying the aforementionedtechniques.

key words: local volatility surface, second order Tikhonov, iterative regularization,adjoint

2

Page 3: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

1 Introduction

The celebrated Black-Scholes model under the assumption of constant volatility has es-tablished itself both in theory and practice as the classical model for pricing Europeanstyle options (Black and Scholes (1973)). Under this assumption, the implied volatilityfor all options of the same underlying would be the same. However, in the market it isusually observed that the Black-Scholes implied volatility varies with both strikes andmaturity, which are respectively referred to as volatility smile or sometimes volatilityskew and term structure of a volatility surface to reflect the change of implied volatilityin space and time direction (Hull (2009)). Sometimes the volatility smile is just used asa general term to describe any variations of the implied volatility surface.

There have been many studies to extend the Black-Scholes theory to account for thevolatility smile and its term structure. Broadly speaking, there are two directions of suchstudies: one direction introduces jumps (Merton (1976)), stochastic volatility (Heston(1993)), or both; while the other direction considers the volatility as a deterministicfunction that depends on both price (or strikes) and time (or maturities), which is usuallycalled the local volatility model. The local volatility model is an one factor model andthus retains the completeness of the model, which means hedging options with just theunderlying asset is possible(Coleman et al. (1999)). Which volatility model is better doesnot constitute the subject of this paper. Crepey (2004) showed that the local delta (thedelta of an option with local volatility) provides a better hedge than the implied deltafrom Black-Scholes model using both simulated and real time-series of equity-index data.Gatheral (2006) proved the variance of local volatility as a conditional expectation ofinstantaneous variance. This paper addresses the calibration of local volatility modelswith respect to European options. However, the techniques introduced here can beapplied to calibrate other volatility models or other options.

There have been a series of studies about calibration of local volatility models of Eu-ropean style options. It was established in the seminal work of Dupire (1994) that thelocal volatility function can be uniquely determined given the existence of the Europeanoptions of all strikes and maturities. However, there are only a limited number of Euro-pean options available. Interpolation or extrapolation of the sparse market prices to fillthe gap, such as studies of (Dupire (1994); Derman and Kani (1994); Rubinstein (1994)and Avellaneda et al. (1997)) are known to be subject to both artificial misinterpretationand stability issues(Crepey (2003b)).

There is another direction of research solving the problem as an inverse problemwithout any interpolation or extrapolation of market prices. Generally, providing theparameters of a model to compute the output of a model is referred to as a forwardproblem while providing the output of a model to recover the parameters is referred toas an inverse problem. Most inverse problems are ill-posed, which is due to the natureof inverse problems, see Hansen (1998) some for good explanations. Our calibrationproblem is also ill-posed. We want to point out here that the ill-posedness is due toboth the presence of noisy data and the nature of the inverse problem rather than thediscrete and finite observations available as pointed out in Lagnado and Osher (1997).

3

Page 4: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

To control the ill-posedness of the inverse problem, some regularization is needed. Themost popular regularization is Tikhonov regularization.

In this research direction, Lagnado and Osher (1997) solve the inverse problem in anon-parametric space, i.e., without assuming any shape of the local volatility surface.They use the first order derivatives of the volatility surface to regularize the inverseproblem together with an expensive way to compute the gradient of a cost function.Most research that followed afterwards used the same approach, namely addressing theproblem in terms of using the first order derivatives of the volatility surface to regularizethe inverse problem, such as Bouchouev and Isakov (1997, 1999); Jiang and Tao (2001);Jiang et al. (2003); Crepey (2003a); Egger and Engl (2005); Hein (2005); Achdou andPironneau (2005) and Turinici (2009). However, the optimal volatility surface is usuallynot smooth enough.

Theoretical studies such as Bouchouev and Isakov (1997, 1999)); Jiang and Tao(2001); Crepey (2003a); Egger and Engl (2005) and Hein (2005) explore issues related tostability, uniqueness and convergence rates. However, according to the authors’ knowl-edge, there is no conclusive theoretical study about the existence, uniqueness and stabilityof the inverse problem to date. Our research also assumes that an unique local volatilityfunction exists.

Bodurtha and Jermakyan (1999) also solve the inverse problem in a non-parametricspace. However, as in Lagnado and Osher (1997), the optimal volatility surface lackssmoothness. Coleman et al. (1999); Achdou and Pironneau (2005) and Turinici (2009)all solve the inverse problem by parameterizing the local volatility surface either thoughcubic splines or piecewise linear segments. By using a cubic spline interpolation toconstruct the volatility surface, the volatility surface has nice smoothness properties.Turinici (2009) proposes to calibrate the local volatility using variance of implied volatilityrather than option prices. This parametric approach, by essentially reducing the numberof parameters, works well when the selected knots can represent well enough the keyregions of true volatility surface. However, it runs the danger of allowing too few degreesof freedom to explain the data. How many knots should be placed and where shouldthey be placed might also be problem dependent.

As illustrated in the following two sections, all of the above studies except (Colemanet al. (1999)) solve the inverse problem by minimizing a cost function which measuresthe misfit between model output and observed prices together with a Tikhonov regular-ization. To successfully carry out the process, there are two key factors required: thegradient of cost function and the parameter of Tikhonov regularization. Gradient basedoptimization routine is mostly used to carried out the minimization. To date most re-search papers compute the gradient by first deriving the analytical adjoint equations andthen discretizing and solving them numerically, such as Jiang et al. (2003); Egger andEngl (2005); Turinici (2009). However, the gradient generated by this approach can beinconsistent with the true gradient, which arises from the process of discretized approxi-mation of the analytical adjoint equations (Giering (2000)). It is also infeasible when theanalytical adjoint model is not available, for example, when the model is complicated.There is also no archival paper addressing how to suitably choose the Tikhonov regular-

4

Page 5: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

ization parameter. At most, it is selected based on some ad hoc experience, such as inCrepey (2003b).

In this paper, we first propose a new method to generate the gradient of cost functionby using just the numerical code of the original model. This gradient has better numericalconsistency with the true gradient as demonstrated by the successful minimization ofthe cost function even without regularization. Second, we carried out a second orderTikhonov regularization, which was never carried out before in the context of quantitativefinance according to the authors’ knowledge, to make use of the smooth property ofvolatility surface. Third, by analyzing why ill-posedness occurs, we propose a new wayto select the Tikhonov regularization parameter. This approach turns out to be veryrobust.

This paper is arranged in the following order. First, in section 2, the mathematicalformulation of the calibration problem is set up and the complex issues related to theinverse problem are discussed in more detail and possible solutions are suggested. In sec-tion 3, we talk about how to use automatic differentiation tools to derive adjoint code tocompute gradient of cost function. In section 4, by analyzing how ill-posedness happenedfor linear inverse problems, we propose a robust way to select Tikhonov regularizationparameters. Last, in section 5, numerical results are presented.

2 Formulation of the calibration problem

2.1 Set up as a least square problem

For consistency, the notations used here are similar to the work of (Lagnado and Osher(1997)). The local volatility model assumes that the price S of an underlying follows ageneral diffusion process:

dS

S= µdt+ σ(S, t)dWt (1)

where µ is the asset return rate in a risk-neutral world, Wt is a standard Brownianmotion process, and the local volatility σ is a deterministic function that may depend onboth the price S of the underlying and time t.

Let V (S0, 0, K, T, σ) denote the theoretical price of an European option with strikeK and maturity T at time 0 for an underlying with spot price S0. Assuming the price Sfollows the stochastic process specified in equation (1), the price function V satisfies thefollowing generalized Black-Scholes PDE:

∂V

∂t+

1

2S2σ2(S, t)

∂2V

∂2S+ (r − q)S

∂V

∂S− rV = 0 (2)

where r is the risk-free continuously compounded interest rate and q is the continuous

5

Page 6: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

dividend yield of the underlying. r and q are both deterministic functions, and in thispaper we assume they are constant.

If the functional form of σ(s, t) is specified, then the price of V (S0, 0, K, T, σ) canbe uniquely determined by solving equation (2) together with appropriate initial andboundary conditions. Suppose we are given the market prices of European options(calls,puts, or both) spanning a set of expiration dates T1,. . ., TN . Assume that for eachexpiration date Ti, there are a set of options with strikes spanning from Ki1, . . ., KiMi

,where Mi represents the total number of strikes for expiration date Ti. Let V a

ij and V bij

respectively denote the bid and ask prices for an option with maturity Ti and strike Kij

at the time t = 0.

The calibration of the local volatility surface to the market is to find a local volatilityfunction σ(s, t) such that the solution of (2) is located between the corresponding bidand ask prices for any option(Kij, Ti), i.e.,

V bij ≤ V (S0, 0, Kij, Tj, σ) ≤ V a

ij

for i = 1, . . . , N and j = 1, . . . ,Mi.

This problem is usually solved by solving the following optimization problem:

minσ(s, t)

G(σ) =N

Σi=1

Mi

Σj=1

[V (S0, 0, Kij, Ti, σ)− V ij

w(i, j)]2 (3)

where V ij = (V bij + V a

ij)/2 is the mean of the bid and ask prices. w(i, j) is a scalingfactor reflecting the relative importance of different options. In this paper, we assumew(i, j) is 1 for all options, which means we assume that every option available is equallyimportant. However, the calibration techniques introduced in this paper can easily caterto non-constant scaling cases. This cost functional G(σ) reasonably quantifies the misfitbetween model predicted option prices and market observed option prices. By minimizingthis functional G, the model prediction would best fit the market.

In the above minimization problem, we need to solve PDE (2) once for each optionprice V (S0, 0, Kij, Ti, σ). Instead of solving PDE (2) a number of times, we can usethe Dupire equation, the dual of Black-Scholes equation, to solve for the options pricesV (S0, 0, Kij, Ti, σ) for all Ti and strike Kij at one time.

The Dupire equation establishes the option prices as a function of strike k and matu-rity τ for a fixed underlying price S0 at reference time t = 0. Let C(k, τ) = C(S0, 0, k, τ)be the price of an European call option at strike k and maturity τ . Then C(k, τ) satisfiesthe following Dupire equation:

∂C

∂τ− 1

2k2σ2(k, τ)

∂2C

∂2k+ (r − q)k

∂C

∂k+ qC = 0 (4)

with boundary and initial conditions for European call options given as:

6

Page 7: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

C(k, 0) = (S0 − k)+ k ∈ (0, k)C(0, τ) = S0e

−qτ τ ∈ (0, τ ]C(k, τ) = 0 τ ∈ (0, τ ]

where k and τ represent the upper boundary of our computational domain in k and τdirection respectively. r, q are still the deterministic continuously compounded interestrate and dividend yield respectively.

Please see the study of (Dupire (1994)) for the derivation of Dupire equation. Observ-ing the similarity between the Dupire equation and Black-Scholes equation, the numericalcode for solving the Black-Scholes equation just needs to be slightly modified to solvethe Dupire equation.

2.2 Issues and proposed solution

Before trying to solve problem (3), we first point out some aspects of problem (3) thatmake it complicated. (3) is a large scale nonlinear inverse problem. First, to solveequation (4), we discretize the computation domain into Nx * Nt grid points. To estimatea volatility surface σ(k, τ) that best fits market prices means we need to estimate σ at eachgrid point. The total number of parameters to estimate is thus Nt*(Nx-1) consideringno volatility is needed at the boundaries. Although as other archival material as wellas our research demonstrated, only the section of volatility surface near the money canbe estimated from market prices, the number of parameters to estimate is still quitelarge. So this is a large scale problem. Second, although the Dupire or Black-Scholesequation is a linear operator on option price V when the volatility σ is independent ofV , it is a nonlinear operator on σ or σ2. Third, as for most inverse problems, it is ill-posed in the sense that small changes in the options prices may lead to big changes in thevolatility surface. But the ill-posedness does not imply that we can not extract meaningfulinformation about the volatility surface from the option prices. Regularization is typicallythe tool introduced to control the ill-posedness.

Most research papers on the calibration of local volatility models used gradient-basedoptimization routines to solve problem(3). The gradient is typically computed from theadjoint equation of either(2) or (4). That is due to the fact that when volatility surface isnot parametrized the dimension of the gradient is so huge that it renders approximatingthe gradient by finite differences computationally very expensive.

In this paper, we propose a new way to derive the gradient of function G in (3)using just the codes for solving model (4) together with application of free automaticdifferentiation tools. This method has several benefits. First, it is not necessary toderive the theoretical adjoint equation of the model. Only the code for solving a modelis needed; this needs to be constructed in any case. This benefit makes it a model freeapproach to compute gradient of cost function in the form of (3) for any model. It willbe a good alternative for complex models when their theoretical adjoint models cannotbe easily derived. Second, there are a number of automatic differentiation tools available

7

Page 8: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

written in different computer languages that can be used to speed up the process ofconstructing the codes to compute the gradient. Third, the gradient derived using thisapproach possesses better numerical consistency with the true gradient than the gradientcomputed from the continuous adjoint model. This approach will be addressed in thenext section.

3 Using automatic differentiation to derive the gra-

dient of cost function

3.1 Description of adjoint model

The adjoint method has recently gained popularity in the quantitative finance field . Forexample, Giles and Glassman (2005), Capriotti and Giles (2010) used it to speed upthe calculation of Greeks. This paper addresses the application of adjoint methods foroptimal parameter estimation.

We shall establish here the relationship between the gradient of a cost function in theform of (3) and its adjoint model in a very general framework to show this relationshipis independent of the model used. Here we use a derivation similar to (Giering (2000)).Consider a general dynamical system and a model describing this system. Assuming themodel is in the form of

F : Rn → Rm

X → Y

where X ∈ Rn is the input or control parameters of the model, Y ∈ Rm is the output ofthe model corresponding to input X

Let Y ∈ Rm be a set of observations of the system output and suppose that the modelcan compute the values Y ∈ Rm corresponding to these observations.

By selecting an appropriate inner product (., .), we can measure the misfit betweenobservations and computed model output by introducing a cost function:

J =1

2(Y − Y,Y − Y)

or

J(X) =1

2(F (X)− Y, F (X)− Y) (5)

By finding the minimum of this cost function, we are looking for input or controlparameters X that best fit the model forecast with the observations.

To find the minimum of the cost function J , the gradient of J with respect to X isusually needed.

8

Page 9: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

If we use Taylor expansion on the cost function

J(X) = J(Xi) + (OJ(Xi),X−Xi) + o(| X−Xi |)

And we neglect the higher order terms, we have

δJ = (OJ(Xi),X−Xi) = (OJ(Xi), δXi) (6)

Now let’s suppose F is sufficiently smooth, then for a small perturbation δXi at Xi,we can linearly approximate the variation in Y by

δY = (A(Xi), δXi) (7)

where A is the Jacobian of F (X) at Xi, which is also usually called the tangent linearmodel of F . The tangent linear model of F is a model that will compute the linearapproximation of δY given a perturbation of δX at X. Unless the model F is severelynonlinear, the tangent linear model is usually a good approximation of δY when δX issmall relative to X. Since this model will be implemented in computer codes, we will talkabout the codes for this tangent linear model, which we shall refer to as tangent linearcode. Similarly, we will refer to the code of adjoint model of F as the adjoint code.

From (5), the variation of cost function J around Xi is:

δJ = (δY, F (Xi)− Y) = (A(Xi)δXi, F (Xi)− Y)

Using definition of adjoint operator, we have

δJ = (A(Xi)δXi, F (Xi)− Y) = (δXi,A∗(Xi)F (Xi)− Y) = (A∗(Xi)(F (Xi)− Y), δXi)

(8)where the operator A∗ is the adjoint of linear operator A. A∗ is also called the adjointmodel of F . In discrete case, A is the Jacobian Matrix, A∗ is just the transpose ofJacobian matrix A for real numbers.

Comparing (8) with (6), we have:

OJ(Xi) = A∗(Xi)(F (Xi)− Y) = A∗(Xi)(Y(Xi)− Y) (9)

Equation (9) establishes that we can compute the gradient of cost function J(X)using adjoint model of F . But why do we want to use the adjoint model to computethe gradient? The reason is that when the dimension of the input X is really large, wewill have to run n + 1 times the model F if we were to use finite difference to computethe gradient of cost function J(X). This becomes computationally very expensive whenthe model F is large and complicated. By using (9) we can just run the adjoint modelonly once to compute the gradient of cost function J(X). Griewank (1989) shows thatthe required numerical operations take only 2−5 times the computation required for thecost function.

9

Page 10: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

3.2 Derivation of adjoint code using automatic differentiation

A complete detailed discussion of the rationale of automatic differentiation is beyond thescope of this paper. See Giering and Kaminski (1998) for details. Here we will just listsome main aspects of automatic differentiation and some of the resources available. Thereare a few automatic differentiation tools available whose details are to be found on thewebsite www.autodif.org for numerical codes written in either C, Fortran or Matlab, suchas TAPENADE, TAMC, ADIFOR to cite but a few. Automatic differentiation is basedon the idea of the chain rule. A numerical model is an algorithm that can be viewed asa composition of differentiable functions F assuming non-differential points will not beincluded, each represented by either a statement or a subroutine in the numerical code.Automatic differentiation computes the derivative of each statement or subroutine andthen combines them together. There are some automatic differentiation tools which willgive warnings when a non-differential point occurs, such as ADIFOR.

There are two modes in automatic differentiation: the forward mode and the adjointor reverse mode. The forward mode computes the derivatives in a top-down approachwhile the adjoint mode computes the derivatives in a bottom-up approach. Feeding thenumerical source code of model F with inputX and outputY to automatic differentiationtools, they will generate the tangent linear code of model F in forward mode whilegenerate the adjoint code of F in reverse mode.

As pointed out in (7), the output of the tangent linear mode is actually δY = AδXrather than the Jacobian matrix A. By letting δX to be an unit vector with 1 at theith component and 0 at all other components, we can use the tangent linear model tocompute the ith column of matrix A. Iteratively we can find all the n columns of theJacobian Matrix A. This approach like the finite difference approach is not the best wayto compute the Jacobian matrix A when n, the number of columns of A is much greaterits number of rows m.

In the application of this paper, the Jacobian Matrix A is not required explicitly.Instead, we just need a matrix and vector product A∗(Xi)(Y(Xi) − Y) as described in(9). However, adjoint code can be used to compute the Jacobian matrix A. By feedingthe adjoint code with the jth column of an identity matrix, we will be able to computethe jth row of the Jacobian matrix A. The number of runs for the adjoint model wouldbe m times, which makes it a better choice for computing A, when n is much greater thanm. This is also the rationale behind the work of Giles and Glassman (2005), Capriottiand Giles (2010).

3.3 Verification of adjoint code

Even though the automatic differentiation tools available now are more robust now, it isstill a good practice to check whether the adjoint code generated by these automatic toolsis right or not especially for complex models. The adjoint code is tested using the twostrategies suggested by Navon et al. (1992). The first strategy is the following identity

10

Page 11: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

test and the second strategy is the gradient test.

(AQ)T (AQ) = QT (AT (AQ)) (10)

where Q represents the input of A, A represents the tangent linear code or a segmentof it, say a subroutine, a loop or even a single statement. AT is the adjoint of A. If(10) holds within machine accuracy for every segment of the tangent linear code A, itcan be said that the adjoint code is correct with respect to the tangent linear code. Inour numerical test, we checked the adjoint code segment by segment, loop by loop, andsubroutine by subroutine. With double precision, the identity (10) is always accuratewithin 13 digits or better. This verifies the correctness of the adjoint code against thetangent linear code.

3.3.1 Test of accuracy of the Tangent Linear Model(TLM)

Test (10) makes use of the tangent linear code to check the correctness of the adjointcode. The tangent linear code generated by automatic differentiation tools has muchhigher chance of being right than for adjoint code. From the authors’ own experience,the tangent linear code is right most of the time. But since the tangent linear modeldepends on the linearization assumption of model F , this assumption needs to be checked.The following test will be used to check both the validity of this assumption and also thecorrectness of tangent linear code. The accuracy of tangent linear model determines boththe accuracy of the adjoint model and also the accuracy of the gradient of cost functionwith respect to the control variables, which all lies in the linearization assumption.

To verify A, we use the fact that A is linearization of the model F :

F (X+ α ∗ δX)− F (X) = A(α ∗ δX) +O(α2)

where δX is a small perturbation around X.

We compare the output from tangent linear code forced by a small forcing δX withthe difference of the twice model call, with and without perturbation respectively. If thelinearization holds and its code is right, then the ratio between these two should approachone as α gets close to zero, as illustrated by the following equation.

r =F (X+ α ∗ δX)− F (X)

A(α ∗ δX)= 1 +O(α) (11)

After verifying the tangent linear code, we can use (10) to check the correctness ofthe adjoint code.

3.3.2 Gradient Test

Even though both the tangent linear code and adjoint code are correct, the gradientgenerated by using the adjoint model still needs to be verified since the accuracy of the

11

Page 12: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

gradient depends not only on the accuracy of the tangent linear and adjoint model, butalso on the approximation involved in linearizing the cost function (6). The gradient canbe verified by again using the Taylor expansion.

Suppose the initial X has a perturbation αh, where α is a small scalar and h is avector of unit length (such as h = OJ‖OJ‖−1). According to Taylor expansion, we getthe cost function:

J(X + αh) = J(X) + αhTOJ(X) +O(α2)

We can define a function of α as:

Φ(α) =J(X+ αh)− J(X)

αhTOJ(X)= 1 +O(α) (12)

Here the gradient OJ(X) is calculated using the adjoint model. So as α tends to zerobut not close to machine precision, this ratio Φ should be close to 1. When α is close tomachine precision, the ratio will deviate from 1 as the machine error dominates. Figure(1) shows the gradient test of our numerical code. We can see as α is between 10−15 and10−4, Φ(α) approaches 1 with high accuracy. The gradient can be also verified indirectlyby the reduction of cost function. Figure (2) shows the decrease of cost function for S&P500 index European call options in October 1995 before any regularization. Details of theS&P 500 options are described in the second example in the numerical test section. Thecost function for the other three sets of options discussed in the numerical test sectioncan be reduced close to zero as well before any regularization is applied. However theoptimal volatility surfaces obtained are very oscillatory and unstable, see figure 3. Thisjust illustrates the under-determined and ill-posed nature of our calibration problem.Regularization is thus necessary to control both the under-determination and the ill-posedness.

12

Page 13: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

−20 −18 −16 −14 −12 −10 −8 −6 −4 −2 0−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

log10(α)

log1

0 (Φ

(α))

Figure 1: verification of the gradient calculation

13

Page 14: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 50 100 150 200 250−5

−4

−3

−2

−1

0

1

2

iteration

log1

0(G

)

Figure 2: Reduction of the cost function without any regularization for S&P 500 index European calloptions in October 1995

14

Page 15: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

00.20.40.70.91.0

1.5

2.0

0.80.911.11.2

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

T

K/S0

σ

Figure 3: The optimal local volatility surface reconstructed before applying any regularization for S&P500 index European call options in October 1995

15

Page 16: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

4 Tikhonov Regularization

4.1 Second order Tikhonov Regularization

Tikhonov Regularization is a most popular regularization method for inverse problems.Tikhonov regularization serves as a good compromise between what we can get from thesystem and the degree of errors in the system, such as observation errors, truncationerrors, discretization errors (Hansen (1998)). It usually assumes the following form.

F (σ) = G(σ) + λ ‖ Lm|σ − σ0| ‖22 (13)

where λ is the regularization parameter. σ0 is a priori estimate of σ. It is 0 if thereis no priori estimate. Lm is an operator. When m = 0, L is an the identity matrix.The regularization is called the zeroth order Tikhonov regularization. When m = 1, Lassumes the form of the first derivative of σ and it is called the first order Tikhonovregularization. As mentioned in the introduction part, most papers on the calibration oflocal volatility surfaces use the first order Tikhonov regularization, which is the following:

F (σ) = G(σ) + λ ‖ Oσ ‖22 (14)

However, the volatility surface generated by the first order Tikhonov regularization isusually rough. Assuming the volatility surface is smooth, we propose to use the followingsecond order Tikhonov regularization.

F (σ) = G(σ) + λ‖∂2σ

∂x2+

∂2σ

∂t2+

∂σ

∂t

∂σ

∂x‖22 (15)

The calibration problem now turns into a constrained minimization problem:

min0<σ(s, t)<1

F (σ) (16)

Usually a gradient based optimization routine is used to find a local minimum of F.The gradient of new cost function F is composed of both the gradient of G, which isderived from the adjoint model, and gradient of the regularization part in (15). Multi-start strategies can be applied in order to find a global minimum of F. However, in thispaper, we ignore the issue of finding the global minimum since we are just interestedin finding a trustable and smooth volatility surface that traders can use to hedge theirpositions.

Since the parameters are bounded between 0 and 1, we use LBFGSB code (Zhu et al.(1997)) to minimize the regularized cost function of (15). LBFGSB is an optimizationroutine to solve bounded or unbounded large scale minimization problems using just thegradient of cost function. It has a super-linear convergence rate yet requires only smallmemory storage since it does not store the Hessian matrix of the cost function.

16

Page 17: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

4.2 Strategy to select Tikhonov regularization parameter λ

A successful solution of the inverse problem depends critically on a suitable selectionof the regularization parameter λ. For linear inverse problems, λ is usually selectedby either the L-curve method or generalized cross validation theory, see Hansen (1998),Aster et al. (2005). Here we will adopt an iterative regularization strategy to solve thenonlinear inverse problem, in which a suitable regularization parameter is selected ateach iteration rather than stays the same throughout the minimization process as in theL-curve method. To determine how to select a suitable regularization parameter in eachiteration, we consider the following analysis.

We are actually solving for a vector M from

GM = D (17)

where G is a nonlinear model operator, M is the input or parameters characterizing amodel and D consists of observation data. In our case, G is Dupire equation (4), M is abig vector containing all the points of the volatility surface with size Nt*(Nx-1), and Dis a vector containing all available option prices.

This problem can not be directly solved. Instead, it is solved by minimizing of a costfunction of the form (5). When we use LBFGSB to iteratively find the minimum of (15)without the regularization, it is essentially equivalent to finding M iteratively using thegradient information of G. In other words, it is to find Mk+1 at iteration k+1 such that,

GMk+1 = GMk +A(δM) ≈ D (18)

where we ignore the higher terms of δM in Taylor expansion, A is the Jacobian matrixof model G.

If equation (18) is not well-posed, then the optimization routine LBFGSB will have ahigh chance of finding unstable solutions. If equation (18) is well posed, the optimizationroutine will be more likely to find stable solutions. Equation (18) is a linear problem ofthe form:

A(δM) = D−GMk (19)

Using singular value decomposition ofA, we find the solution to (19) can be expressedas:

δM = VpS−1p UT

p (D−GMk) =p

Σi=1

UT., i(D−GMk)

siV., i (20)

We can see that the inverse solution can become extremely unstable when the smallsingular values, si, are close to zero. A plot of the singular values of matrixA of one of ournumerical tests is shown in Figure 4. Figure 5 shows the singular values normalized by thebiggest singular values. The decay of the singular values is in the order of O(i−µ), whereµ ≤ 1. According to the definition in Hofmann (1986), it is a mildly ill-posed problem.

17

Page 18: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

Finding out where the ill-posedness comes from, we can regulate the ill-posedness bydiluting the effects of the small singular values si in the eigenvalue spectrum.

Thus our criteria of selecting the regularization parameter λ in each iteration is tochoose λ such that it is greater than the smallest singular values of A while smaller thanthe dominant singular values. In this way, the effect of the small singular values arereduced or eliminated while the main information represented by the dominant singularvalues are still retained. This is a feasible strategy because of the following reasons.

We are essentially using an optimization routine to find the minimum of a large scaleproblem (15). In large scale optimization, it is a good strategy to pass just the major partof the information from one iteration to the next iteration while allowing some degree offreedom to delete other components to find the global minimum, at least with a higherchance, such as the optimization by simulated annealing.

In our numerical experiment, we sort the singular values in a decreasing order, andthen take the first singular value up to which the sum of the first n singular values thataccounts for 50% of the sum of all singular values. This percentage here, which willbe referred to as the truncation level of singular values, is used to determine whichsingular value will be used as the regularization parameter. It is the only parameter thatis subject to change in our proposed numerical method. The higher this truncation level,the smaller the chosen Tikhonov regularization parameter.

To find the singular values of A, we use the package ARPACK, which is available at:www.caam.rice.edu/software/ARPACK. All that it requires is a code computing theproduct of the matrix A with a vector rather than the matrix A itself, which suitsour case perfectly. For one thing, A has a really high dimension and requires a seriousamount of resources to be stored explicitly. On the other hand , as we build the code ofadjoint model AT , whose output is actually the product of AT and its input vector. Wethus readily have a code to use ARPACK to compute the singular values of AT , whichare also the singular values of A.

18

Page 19: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 10 20 30 40 50 60−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

log1

0(si

ngul

ar v

alue

s)

Figure 4: The singular values of the Jacobian matrix A in the first LBFGSB iteration for calibrationof the local volatility surface for S&P 500 index European call options in October 1995

19

Page 20: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 10 20 30 40 50 600

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

scaled singular values

i−0.2

i−1

Figure 5: Normalized singular values of the Jacobian matrix A in the first LBFGSB iteration forcalibration of the local volatility surface S&P 500 index European call options in October 1995

20

Page 21: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

5 Numerical Results

Before discussing our numerical tests, let us first summarize our algorithm:

1. Initialize volatility surface σ0(s, t)

2. Use (4) to compute option prices Vcmpt and cost function G in (3)

3. Feed the difference between Vcmpt and Vobs into adjoint model A∗ derived in section(3.2), using (9), to compute the gradient of G with respect to σ(s, t)

4. Use ARPACK to compute the singular values of Jacobian Matrix A and computethe regularization parameter λ of (15) at the truncation level 50% as describedin section (4.2)

5. Add the regularization part of (15) to G to form the regularized cost function Fof (15). Add the gradient of regularization part of (15) to the gradient obtained instep 3 to compute the gradient of cost function F with respect to σ(s, t)

6. Plug the cost function F and its gradient with respect to σ(s, t) into LBFGSBroutine to find the next estimate σk+1(s, t). k = 0, 1, 2, · · ·

7. Stop if the stopping criterion of LBFGSB is satisfied or 500 minimization iterationswere reached, whichever occurs first. Otherwise, go back to step 2.

The Dupire equation (4) is solved using backward Euler scheme in time and centraldifference scheme in space direction, respectively. The computation domain

[0 T

]×[

0 K]is set as K = 2S0 as in study of (Lagnado and Osher (1997)) while T is the

longest maturity. NX = 200, Nt =100 are used for all our four numerical tests. Thelower and upper bound for σ when running LBFGSB is set to be 0 and 1. The stoppingcriteria in LBFGSB is set as factr = 100, pgtol = 10−6, which means LBFGSB stopswhen the projected gradient is less than 10−6 or the relative reduction of f betweentwo consecutive iterations is less than factr*machine precision when f is greater than orequal to 1 or the absolute difference of f between two consecutive iterations is less thanfactr*machine precision when f is less than 1. For details, please see (Zhu et al. (1997)).The initial guess σ0 =0.15 for all four cases.

To demonstrate the robustness of our method, we start with a theoretical model asused in both Lagnado and Osher (1997) and Coleman et al. (1999). In this example, thelocal volatility function is in the form of

σ(s, t) =15

s(21)

The options prices have closed-form solutions in this case Cox and Ross (1976).Twenty two European call option prices are generated using the closed-form solutionfor two maturities T =0.5 and T =1.0 with eleven options for each maturity. Then these

21

Page 22: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

option prices are used to recover the volatility surface (21). Like in the study of Lagnadoand Osher (1997); Coleman et al. (1999), S0 =100, the risk free interest rate r =0.05,and the dividend yield q = 0.02. Figure 6 shows the recovered volatility surface andthe true volatility surface. The recovered volatility can be hardly distinguished fromthe true volatility surface, which shows an almost exact recovery of the true volatilitysurface. Figure 7 shows the relative error of computed options prices using the recoveredvolatility surface with respect to the true option prices. The relative error is in the orderof 10−3.

To demonstrate the robustness of our methods, we added noise to the true optionprices to see whether we can still recover the volatility surface. The noise is introducedby:

vi = the exact price of option i vi*(1+noise-level(0.5-rand))

where rand is an uniformly distributed random number between 0 and 1 generatedby GNU gfortran, noise-level is the percentage of noise that is added to each optionprice. In our example, we tested noise levels of 2%, 5%, and 10%. Figure 8, 9, and10 show the recovered volatility surfaces under these three noise-levels compared to thetrue volatility surface. We can see that when the noise level is low, like 2%, the optimalvolatility surface recovered still approximates the true volatility surface very well. Weuse equation (22) to measure the relative error of optimal volatility surface compared tothe true volatility surface. The relative error for 2% noise is 6.5%. Even when the noiselevel is high, like 5% or 10%, the optimal volatility still reasonably approximates the truevolatility surface although the deviation from the true volatility surface is higher thanin the low noise level case. The relative errors calculated using (22) are 12% and 19%respectively for noise levels 5% and 10%.

r =‖σtrue − σoptimal‖2

‖σtrue‖2(22)

Table 1 shows the absolute relative error of some Greeks computed using the optimalvolatility surface with respect to the Greeks computed using the true volatility surface.Delta and Rho are computed using the forward mode of automatic differentiation, whichwill compute the exact value of the Greeks. Vega is approximately computed by finitedifferences as in (Coleman et al. (1999)). A constant perturbation of both volatilitysurfaces is used to compute the relative error. We can see that when the noise level islow, the mean absolute relative error is less than 1.5% for all three Greeks. The meanabsolute relative error for Rho is as small as 0.2%. Even when then noise level is high like10%, the Greeks reconstructed still approximate the true Greeks fairly well with the meanabsolute relative error is less than 5% for all three Greeks. Gamma is not computed heresince Gamma would be zero for both cases due to the setup of the boundary conditionsof (4). The high degree of approximation of Greeks even when the option prices arecontaminated with noise demonstrates further the robustness of our calibration method.This robustness may mean a better hedge for traders.

22

Page 23: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

The robustness of our calibration method even for noisy option prices is very differentfrom the findings of Coleman et al. (1999), where a parametric volatility surface spannedby cubic splines is solved. In Coleman et al. (1999), it is argued that when the numberof interpolation knots exceed the number of options available, even a small amount ofnoise will render the optimal volatility surface invalid. Our method however is robustnot only with noisy data but also with an increasing number of degrees of freedom. It istested that refining the mesh grids, i.e., adding more parameters or degrees of freedom,does not reduce the effectiveness of this method. The general shape of optimal volatilitysurface stays the same when Nx and NT increase. However, NX, NT should be largeenough to make sure this would happen.

23

Page 24: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

noise level Greeks max |relative error| mean |relative error| min |relative error|Delta 1.3% 0.7 % 0.2 %

2 % Vega 5.2 % 1.3 % 0.01 %Rho 1.0 % 0.2 % 0.02 %Delta 3.4% 2.2 % 0.7 %

5 % Vega 12 % 3.0 % 0.3 %Rho 2.3 % 0.5 % 0.02 %Delta 6.6% 4.7 % 1.8 %

10 % Vega 21 % 4.6 % 0.6 %Rho 4.0 % 1.2 % 0.02 %

Table 1: The absolute relative error of Greeks computed using volatility surface reconstructed fromnoisy prices compared to Greeks computed from true volatility surface for volatility model σ(s, t) = 15

S

24

Page 25: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

The second example of our method is another benchmark test as in Coleman et al.(1999); Andersen and Brotherton-Ratcliffe (1998) and Turinici (2009). The options areEuropean Call options on S&P 500 index in October 1995. There are a total of 57 optionswith seven maturities. The initial index, interest rate, and dividend yield are provided inthe footnotes of Figure 11. Figure 11 shows the optimal volatility obtained. Comparedto other studies, our volatility surface not only has the best smoothness but also is ina reasonable range between 0.1 and 0.35, which is not the case in other studies such asColeman et al. (1999). It also has a nice skew structure that agrees with the statement byHull (2009) that traders use a skewed volatility to price European stock index options.The relative errors of computed prices with respect to observed prices are plotted inFigure 12. The relative errors are mostly close to zero except for options whose pricesare close to zero and maturities occur sooner. This can be accepted because options withnearly zero prices allow much higher degree of relative difference between bid and ask,or in other words, allow much higher degree of approximation error. The mean absoluterelative error is 4.6%. Excluding the five options with big absolute relative errors, themean absolute relative error is as small as 0.27%.

The last two examples are concerned with European call options in the foreign ex-change market. The first one was studied by both Avellaneda et al. (1997) and Turinici(2009). They are 15 European call options for the US dollar/Deutsche mark with 5 ma-turities, which are computed from 20, 25 and 50 delta risk-reversals quoted on Aug 23,1995. The spot price and interest rates are shown in the footnotes of Figure 13. Theoptimal volatility surface and relative error of computed price are shown in Figure 13 and14 respectively. The volatility surface has a nice smile shape as expected for Europeanoptions on foreign exchange rates. The mean absolute relative error is 1.8%.

The last example is about the European call options for the euro/US dollar datedMar 18, 2008 as in study (Turinici (2009)). There are a total of 30 options with 6maturities. The option prices are computed from quoted 10, 25, and 50 delta risk-reversal and strangles. The spot rate and interest rates for each currency are listed onfootnotes of Figure 15. As Figure 15 demonstrates, the volatility surface exhibits a sortof bell shape structure in the long run. It also has some short term structures. As thelocation of maturities shows, the near term volatility structure has something to do withclustering of options of short maturities. Figure 16 shows the relative error of computedprice with respect to observed prices. Again, big relative error occurs when the optionprices are close to zero and maturities are soon. The mean absolute relative error is 6%.But the mean absolute relative error is as small as 0.9% when the short term options andalso options with nearly zero prices are not included.

5.1 Truncation level, scaling and computation time

The only parameter that is subject to change in our algorithm is the truncation level.But it is fixed at 50% thorough our four tests. Other truncation levels are also tested. Thehigher the truncation level, the more oscillatory of the volatility surface. Since in this casethe Tikhonov regularization parameter is small then some noises due to discretization

25

Page 26: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

or market noise are not efficiently filtered out. The lower the truncation level, the morestable and smooth of the volatility surface but at the risk of increasing the relative error.The relative error and the general shape of the optimal volatility surface actually donot change too much overall when the truncation level is less than 0.9, which meansthis method is fairly robust with different truncation levels as long as the regularizationparameter is not selected such that those singular values close to the smallest singularvalue in the spectrum are retained. This point can be also implied from the fact that weuse a fixed truncation level for all of our four numerical tests. The author suggests that(0, 0.9) is a good range for the truncation level.

For interest rate options, it is a good practice to scale the spot price to be 100 andalso scale the option prices accordingly. Because the prices of interest rate options aresmall, after squaring them in the cost function, they become too small to be accountedfor. After scaling the spot rate S0 to 100, the relative errors are reduced. When thespot price is large, for example the stock index option with S0 = 590, there is neithersignificant improvement nor worsening of the relative error by scaling S0 to 100. However,we find it a good practice to carry out scaling since in this way we just need to pre-processthe spot price and option prices for any kind of European options on any underlying suchas equity index, foreign exchange without changing anything in the numerical code torecover the optimal volatility surface.

For all the last three numerical tests, the computation time is 166, 12, 59 secondsrespectively using a Dell Vostro 1720 with Intel Core Duo CPU @2.2G HZ and 2GBRAM. For the first numerical test, to reach high accuracy accuracy, no limit to maximumnumber of iterations was imposed, the computation time is 407 seconds using the samecomputer.

6 Conclusion

Our research solves the calibration of the local volatility surface for European optionsin a non-parametric way. We propose a new way to use the automatic differentiationtool TAPENADE to develop the adjoint model of the Dupire equation, which is thenused to compute the gradient of cost function. The gradient generated in this way isnumerically more consistent with the true gradient of cost function than the continuousadjoint approach used in most research papers so far. We also first propose, accordingto the best of the authors’ knowledge, to use a second order Tikhonov regularization toregularize the calibration problem. Additionally we propose an efficient way to choosethe Tikhonov regularization parameter by exploring the causes of ill-posedness. It isselected as a suitable singular value of the Jacobian matrix that is implicitly used duringthe optimization process.

Our method is robust as evidenced by its successful recovery of the volatility surfaceof a theoretical model even when the option prices are contaminated with noise. This is abig improvement compared to other published papers so far according to the best of theauthors’ knowledge. The robustness of our method is also validated by the reasonable

26

Page 27: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

volatility surfaces recovered from the three real world examples that have been studiedin other papers. Furthermore, although the we are dealing with a large scale problem,the computation time is so short that it can be used for real time applications. This isbecause both the LBFGSB routine and ARPACK packages that we employ can save timeand storage requirements. Last, although our numerical tests are carried out exclusivelyon European call options, the method introduced should also work for European putoptions or a combination of both. The calibration techniques proposed is developed ina very general framework. It can be generalized to explore calibration of other volatilitymodels or calibration with respect to other options.

27

Page 28: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0

0.5

1

0.80.911.11.2

0.12

0.13

0.14

0.15

0.16

0.17

0.18

0.19

TK/S0

σ

Figure 6: The true volatility surface and optimal volatility surface for volatility model σ(s, t) = 15S

as studied in (Lagnado and Osher (1997) and Coleman et al. (1999)) using 22 options prices computedfrom (Cox and Ross (1976)).

28

Page 29: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 5 10 15 20 250

5

10

15

optio

n pr

ices

(Vob

s)

0 5 10 15 20 25−2

0

2

4x 10

−3

rela

tive

erro

r : r

= (V

cmpt

−V

obs)

/Vob

s

relative error

option prices

Figure 7: Left: The true options prices. Right: The relative error of the computed options pricesusing optimal volatility surface with respect to the true prices for model σ(s, t) = 15

S . Option prices areplotted in an increasing order of maturity date. Option prices with the same maturity look like on thesame curve. S0 =100, the risk free interest rate r =0.05 and the dividend yield q = 0.02.

29

Page 30: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0

0.5

1

0.80.9

11.1

1.2

0.1

0.12

0.14

0.16

0.18

0.2

T

K/S0

σ

Figure 8: The optimal volatility surface compared to the true volatility surface when 2% uniform noiseis added to the true option prices for volatility model σ(s, t) = 15

S

30

Page 31: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0

0.5

1

0.80.9

11.1

1.2

0.08

0.1

0.12

0.14

0.16

0.18

0.2

0.22

T

K/S0

σ

Figure 9: The optimal volatility surface compared to the true volatility surface when 5% uniform noiseis added to the true option prices for volatility model σ(s, t) = 15

S

31

Page 32: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0

0.5

1

0.80.911.11.2

0

0.05

0.1

0.15

0.2

0.25

TK/S0

σ

Figure 10: The optimal volatility surface compared to the true volatility surface when 10% uniformnoise is added to the true option prices for volatility model σ(s, t) = 15

S

32

Page 33: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

00.2

0.40.7

0.91.0

1.5

2.0

0.80.9

11.1

1.2

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

TK/S0

σ

Figure 11: The optimal volatility surface obtained for S&P 500 index European call options in October1995. S0 =$ 590, r=0.06, q=0.0262. Note: the available maturities are plotted on the T axis in unit ofyears

33

Page 34: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 10 20 30 40 50 600

50

100

150

optio

n pr

ices

(Vob

s)

0 10 20 30 40 50 60−1

−0.5

0

0.5

rela

tive

erro

r : (V

cmpt

−V

obs)

/Vob

s

relative erroroption prices

Figure 12: Left: The prices of S&P 500 index European call options in October 1995 (Andersenand Brotherton-Ratcliffe (1998)). Right: The relative errors of computed option prices with respect toobserved price. Option prices are plotted in an increasing order of maturities. Option prices with thesame maturity look like on the same curve.

34

Page 35: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 0.08

0.160.24

0.49

0.74

0.8

0.9

1

1.1

1.2

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

TK/S0

σ

Figure 13: The optimal volatility surface obtained for European call options of US dollar/ Deutschemark rate. The spot price was S0 =1.48875; US dollar interest rate was rUS = 5.91%; Deutsch markrate was rDeutschmark= 4.27%. Note: the available maturities are plotted on the T axis in unit of years

35

Page 36: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 5 10 150

0.01

0.02

0.03

0.04

0.05

0.06

optio

n pr

ices

(Vob

s)

0 5 10 15−0.06

−0.04

−0.02

0

0.02

0.04

0.06

rela

tive

erro

r : r

= (V

cmpt

−V

obs)

/Vob

s

relative erroroption prices

Figure 14: Left: The prices of European call options on US dollar/Deutsche mark rate recovered from20, 25, and 50 delta risk reversals (Avellaneda et al. (1997)). Right: The relative error of computedoption price with respect to observed price. Option prices are plotted in an increasing order of maturitydate. Option prices with the same maturity look like on the same curve.

36

Page 37: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

00.020.080.16 0.25

0.50

1.0

0.8

0.9

1

1.1

1.2

0

0.05

0.1

0.15

0.2

TK/S0

σ

Figure 15: The optimal volatility surface obtained for European call options on euro/US dollar rateon Mar 18, 2008. The spot price was 1.5755; US dollar interest rate was rUSD = 2.485%; euro interestrate was rEUR =4.550%

37

Page 38: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

0 5 10 15 20 25 300

0.2

0.4

optio

n pr

ices

(Vob

s)

0 5 10 15 20 25 30−0.5

0

0.5

Rel

ativ

e er

ror :

r =

(Vcm

pt −

Vob

s)/V

obs

option pricesrelative error

Figure 16: Left: The prices of European call options on euro/US dollar rate on Mar 18, 2008 recoveredfrom quoted 10, 25, and 50 delta risk-reversals and straddles.(Turinici (2009)). Right: The relative errorof computed option price with respect to recovered price. Option prices are plotted in an increasingorder of maturity date. Option prices with the same maturity look like on the same curve.

38

Page 39: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

References

Achdou, Y. and Pironneau, O. (2005), Computational Methods for Option Pricing, Soci-ety for Industrial and Applied Mathematics(SIAM), Philadelphia.

Andersen, L. and Brotherton-Ratcliffe, R. (1998), “The equity option volatility smile: animplicit finite difference approach”, The Journal of Computational Finance , Vol. 1,pp. 37–64.

Aster, R., Borchers, B. and Thurber, C. (2005), Parameter Estimation and Invese Prob-lems, Elesvier Academic Press, Burlington.

Avellaneda, M., Fridman, C., Holems, R. and Samperi, D. (1997), “Calibrating volatilitysurface via relative entropy minimization”, Applied Mathematical Finance , Vol. 4,pp. 667–686.

Black, F. and Scholes, M. (1973), “The pricing of options and corporate liabilities”,Journal of Political Economy , Vol. 81, pp. 637–659.

Bodurtha, Jr, J. and Jermakyan, M. (1999), “Nonparametric estimation of an impliedvolatility surface”, The Journal of Computational Finance , Vol. 2, pp. 29–61.

Bouchouev, I. and Isakov, V. (1997), “The inverse problem of option pricing”, InverseProblems , Vol. 13, pp. L11–L17.

Bouchouev, I. and Isakov, V. (1999), “Uniqueness,stability and numerical methods for theinverse problem that arises in financial markets”, Inverse Problems , Vol. 15, pp. R95–116.

Capriotti, L. and Giles, M. (2010), “Fast correlation greeks by adjoint algorithmic differ-entiation”, Risk , Vol. 23, pp. 79–83.

Coleman, T., Li, Y. and Verma, A. (1999), “Reconstructing the unknown local volatilityfunction”, The Journal of Computational Finance , Vol. 2, pp. 77–100.

Cox, J. and Ross, S. (1976), “The valuation of options for alternative stochastic pro-cesses”, Journal of Financial Economics , Vol. 3, pp. 145–166.

Crepey, S. (2003a), “Calibration of the local volatility in a generalized Black-Scholesmodel using Tikhonov regularization”, Journal of Mathematical Analysis on SIAM ,Vol. 34, pp. 1183–1206.

Crepey, S. (2003b), “Calibration of the local volatility in a trinomial tree using Tikhonovregularization”, Inverse Problems , Vol. 19, pp. 91–127.

Crepey, S. (2004), “Delta-hedging vega risk”, Quantitative Finance , Vol. 4, pp. 559–579.

Derman, E. and Kani, I. (1994), “Riding on a smile”, Risk , Vol. 7, pp. 32–39.

Dupire, B. (1994), “Pricing with a smile”, Risk , Vol. 7, pp. 18–20.

39

Page 40: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

Egger, H. and Engl, H. (2005), “Tikhonov regularization applied to the inverse problem ofoption pricing: convergence analysis and rates”, Inverse Problems , Vol. 21, pp. 1027–1045.

Gatheral, J. (2006), The volatility surface: a practitioner’s guide, John Wiley & Sons,Inc, New Jersey.

Giering, R. (2000), Tangent linear and adjoint biogeochemical models, in P. N. R. D.P.Kasibhatla, M.Heimann, ed., ‘Inverse methods in global biogeochemical cycles’,American Geophysical Union, Washington DC, pp. 33–47.

Giering, R. and Kaminski, T. (1998), “Recipes for adjoint code construction”, ACM onTransactions on Mathematical Software , Vol. 24, pp. 437–474.

Giles, M. and Glassman, P. (2005), “Smoking adjoints: Fast calculation of greeks inmonte carlo calculation”, Technical Report , Vol. NA-05/15.

Griewank, A. (1989), On automatic differentiation, in M.IRI and K.TANABE, eds,‘Mathematical programming: Recent Developments and Applications’, Kluwer Aca-demic Publishers, Dordrecht, pp. 83–108.

Hansen, P. (1998), Rank-Deficient and Discrete Ill-Posed Problems: numerical aspects oflinear inversion, Society for Industrial and Applied Mathematics(SIAM), Philadelphia,chapter 1.

Hein, T. (2005), “Some analysis of Tikhonov regularization for the inverse problem ofoption pricing in the price dependent case”, Journal for Analysis and its Applications, Vol. 24, pp. 593–609.

Heston, S. (1993), “A closed-form solution for options with stochastic volatility withapplication to bond and currency options”, The Review of Financial Studies , Vol. 6,pp. 327–343.

Hofmann, B. (1986), Regularization for Applied Inverse and Ill-Posed Problems, Teubner,Stuttgart,Germany.

Hull, J. (2009), Options, futures, and other derivatives, Pearson Education, New Jersey.

Jiang, L., Chen, Q., Wang, L. and Zhang, J. (2003), “A new well-posed algorithm torecover implied local volatility”, Quantitative Finance , Vol. 3, pp. 451–457.

Jiang, L. and Tao, Y. (2001), “Identifying the volatility of underlying assets from optionprices”, Inverse Problems , Vol. 17, pp. 137–155.

Lagnado, R. and Osher, S. (1997), “Reconciling difference”, Risk , Vol. 10, pp. 79–83.

Merton, R. (1976), “Option pricing when underlying stock returns are discontinuous”,Journal of Financial Economics , Vol. 3, pp. 125–44.

40

Page 41: Non-parametric calibration of the local volatility surface ...inavon/pubs/localpaper2.pdf · tain adjoint code of the direct model is employed as an e cient way to obtain the gradient

Navon, I., Zou, X., Derber, J. and Sela, J. (1992), “Variational data assimilation withan adiabatic version of the nmc spectral model”, Monthly Weather Review , Vol. 120,pp. 1433–1446.

Rubinstein, M. (1994), “Implied binomial tress”, Journal of Finance , Vol. 69, pp. 771–818.

Turinici, G. (2009), “Calibration of local volatility using the local and implied instanta-neous variance”, The Journal of Computational Finance , Vol. 13, pp. 1–18.

Zhu, C., Byrd, R. and Nocedal, J. (1997), “L-bfgs-b: Algorithm 778: L-bfgs-b, for-tran routines for large scale bound constrained optimization”, ACM Transactions onMathematical Software , Vol. 23, pp. 550–560.

41