
  • Statistical Methods in Finance and other fields

    Michael Pitt, King's College London

    Cumberland Lodge

    February 2016


  • Statistical methods in Finance and Related Fields

    Problems which arise in finance.

    Main examples are volatility and portfolio allocation.

    Similar problems arise elsewhere:

    Bearings-only tracking problem.

    Main technique for estimating off-line: MCMC.

    Main technique for estimating on-line: particle filters.

  • Exchange rate return data

    $y_t = 100 \times \log(S_t / S_{t-1})$.

    Figure: daily returns for the Pound, DM, Yen, SF and FF; correlogram of daily returns; correlogram of absolute returns; partial sums of the autocorrelations of absolute returns.

  • S&P500 returns

    $y_t = 100 \times \log(S_t / S_{t-1})$.

    Figure: S&P 500 continuously compounded returns (in %). TOP RIGHT: the filtered quantiles of the standard deviation. BOTTOM: diagnostics through the predictive distribution function (QQ-plot and ACF of the distribution functions).

  • Stylized facts:

    Returns $y_t$ are uncorrelated.

    Heavy-tailed.

    Exhibit slowly changing variance.

    We want the models to exhibit these key features.


  • Stochastic volatility model (discrete time):

    $$y_t = \log S_t - \log S_{t-1} = \exp(\alpha_t/2)\,\varepsilon_t \sim N\{0, \exp(\alpha_t)\}$$

    $$\alpha_t = \mu + \rho(\alpha_{t-1} - \mu) + \sigma u_t,$$

    where $u_t$ and $\varepsilon_t$ are independent Gaussian terms.

    The model has three parameters, $\theta = (\mu, \rho, \sigma)$.

    Simple to work out the properties (autocorrelations, marginal moments, etc.).

    Can be regarded as a discretization of a continuous time process.

    Harder to estimate!

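    The model is easy to simulate; a minimal sketch follows (the parameter values are illustrative, not from the slides):

```python
import numpy as np

def simulate_sv(T, mu=-0.5, rho=0.95, sigma=0.2, seed=0):
    """Simulate the discrete-time SV model; parameter values are illustrative."""
    rng = np.random.default_rng(seed)
    alpha = np.empty(T)
    # start from the stationary N(mu, sigma^2 / (1 - rho^2)) law of the AR(1)
    alpha[0] = mu + sigma / np.sqrt(1 - rho**2) * rng.standard_normal()
    for t in range(1, T):
        alpha[t] = mu + rho * (alpha[t - 1] - mu) + sigma * rng.standard_normal()
    y = np.exp(alpha / 2) * rng.standard_normal(T)  # y_t = exp(alpha_t / 2) eps_t
    return y, alpha
```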

  • Stochastic volatility model (continuous time):

    $$d\log S(u) = \exp\{x(u)/2\}\, dB_1(u)$$

    $$\log S(t) - \log S(t-1) \sim N\left\{0, \int_{t-1}^{t} \exp\{x(u)\}\, du\right\}$$

    where

    $$\sigma^{2*}_t = \int_{t-1}^{t} \exp\{x(u)\}\, du$$

    is the integrated volatility over the day. We can assume a general diffusion for $x(u)$:

    $$dx(u) = a\{x(u)\}\, du + b\{x(u)\}\, dB_2(u).$$

    For simulation-based inference we can use an Euler approximation (divide the day into short strips). Particle filters are the only way to estimate such models!

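    A minimal sketch of that Euler scheme, assuming user-supplied drift and diffusion callables `a` and `b` (the function names, strip count and left-point rule are assumptions, not from the slides):

```python
import numpy as np

def euler_integrated_vol(a, b, x0, n_strips=48, seed=0):
    """Euler approximation over one day of dx = a(x) du + b(x) dB_2,
    accumulating the integrated volatility int exp{x(u)} du."""
    rng = np.random.default_rng(seed)
    du = 1.0 / n_strips
    x = x0
    integrated = 0.0
    for _ in range(n_strips):
        integrated += np.exp(x) * du  # left-point rule for the integral
        x += a(x) * du + b(x) * np.sqrt(du) * rng.standard_normal()
    return integrated
```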

  • A directed acyclic graph (DAG) of the problem:



  • Tracking (bearings-only):

    $$z_t = \tan^{-1}\!\left(\frac{y_t}{x_t}\right) + \varepsilon_t$$

    $$\begin{pmatrix} y_t \\ v^y_t \\ x_t \\ v^x_t \end{pmatrix} = T \begin{pmatrix} y_{t-1} \\ v^y_{t-1} \\ x_{t-1} \\ v^x_{t-1} \end{pmatrix} + \begin{pmatrix} 0 \\ u^y_t \\ 0 \\ u^x_t \end{pmatrix},$$

    where $u^y_t$, $u^x_t$ are the random noise terms (accelerations).

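    The slides leave the transition matrix $T$ unspecified; a common choice is the constant-velocity form sketched below (the block structure and time step are assumptions, and `arctan2` stands in for $\tan^{-1}$ to resolve the quadrant):

```python
import numpy as np

def transition_matrix(dt=1.0):
    """Constant-velocity transition over the stacked state (y, v_y, x, v_x)."""
    block = np.array([[1.0, dt],
                      [0.0, 1.0]])
    return np.kron(np.eye(2), block)  # one block for (y, v_y), one for (x, v_x)

def bearing(state, noise_sd, rng):
    """Noisy bearing observation z_t = arctan(y_t / x_t) + eps_t."""
    y, _, x, _ = state
    return np.arctan2(y, x) + noise_sd * rng.standard_normal()
```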

  • Offline problem:

    We have all the data to hand, $y_1, \ldots, y_T$, and simply wish to conduct Bayesian inference.

    Prior distribution with density $p(\theta)$.

    Bayesian inference relies on the posterior

    $$\pi(\theta) = p(\theta \mid y) = \frac{p(y; \theta)\, p(\theta)}{\int_\Theta p(y; \theta')\, p(\theta')\, d\theta'}.$$


  • Offline problem:

    For some (conjugate) models this is easy.

    Examples include the linear model (regression) and the exponential family.

    $$y_t \overset{iid}{\sim} \text{Bernoulli}(\theta), \qquad \theta \sim \text{Beta}(\alpha, \beta)$$

    $$\theta \mid y_{1:t} \sim \text{Beta}\left(\alpha + \sum_{i=1}^{t} y_i,\; \beta + t - \sum_{i=1}^{t} y_i\right)$$

    $$E[\theta \mid y_{1:t}] = \frac{\alpha + \sum_{i=1}^{t} y_i}{\alpha + \beta + t} \to \bar y.$$

    Our models are not like that (unfortunately!)

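    A quick numerical check of this conjugate update (a sketch; the data-generating value 0.3 is illustrative):

```python
import numpy as np

def beta_bernoulli_update(y, alpha=1.0, beta=1.0):
    """Posterior Beta parameters after observing Bernoulli draws y_1, ..., y_t."""
    s = int(np.sum(y))
    return alpha + s, beta + len(y) - s

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.3, size=1000)
a_post, b_post = beta_bernoulli_update(y)
print(a_post / (a_post + b_post))  # posterior mean: close to the sample mean
```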

  • Offl ine problem:

    We have an expanded space $\pi(\theta, x) = p(x, \theta \mid y)$. We can use MCMC (Markov chain Monte Carlo) to generate a (reversible) Markov chain with $\pi(\theta, x)$ as the invariant distribution:

    $$Q(\theta_{j+1}, x_{j+1} \mid \theta_j, x_j)\, \pi(\theta_j, x_j) = Q(\theta_j, x_j \mid \theta_{j+1}, x_{j+1})\, \pi(\theta_{j+1}, x_{j+1}).$$

    If we are only interested in $\theta$ then (this is key):

    $$\pi(\theta) = \int \pi(\theta, x)\, dx.$$

    So we only need to record the $\theta_j$ path.


  • A Nonlinear State-Space Model

    Standard non-linear model:

    $$X_t = \tfrac{1}{2} X_{t-1} + \frac{25\, X_{t-1}}{1 + X_{t-1}^2} + 8 \cos(1.2\, t) + V_t, \qquad V_t \overset{i.i.d.}{\sim} N(0, \sigma_V^2),$$

    $$Y_t = \tfrac{1}{20} X_t^2 + W_t, \qquad W_t \overset{i.i.d.}{\sim} N(0, \sigma_W^2).$$

    $T = 200$ data points with $\theta = (\sigma_V^2, \sigma_W^2) = (10, 10)$.

    Difficult to perform standard MCMC as $p(x_{1:T} \mid y_{1:T}, \theta)$ is highly multimodal.

    We sample from $p(\theta \mid y_{1:T})$ using a random walk pseudo-marginal MH where $p(y_{1:T}; \theta)$ is estimated using SMC with $N$ particles; a simulation sketch follows below.

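    A sketch that simulates this benchmark model (the initial-state draw is an assumption; the slides do not give it):

```python
import numpy as np

def simulate_benchmark(T=200, sigma2_V=10.0, sigma2_W=10.0, seed=0):
    """Simulate the standard non-linear state-space benchmark model."""
    rng = np.random.default_rng(seed)
    x = np.empty(T)
    y = np.empty(T)
    x_prev = rng.normal(0.0, np.sqrt(5.0))  # assumed X_0 ~ N(0, 5)
    for t in range(1, T + 1):
        x_t = (0.5 * x_prev + 25.0 * x_prev / (1.0 + x_prev**2)
               + 8.0 * np.cos(1.2 * t) + rng.normal(0.0, np.sqrt(sigma2_V)))
        y[t - 1] = x_t**2 / 20.0 + rng.normal(0.0, np.sqrt(sigma2_W))
        x[t - 1] = x_t
        x_prev = x_t
    return x, y
```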

  • A Nonlinear State-Space Model: MCMC path

    Figure: MCMC paths of $\sigma_V$ and $\sigma_W$; autocorrelation of $\{\sigma_V^{(i)}\}$ and $\{\sigma_W^{(i)}\}$ of the MH sampler for various $N$.

  • Online problems

    Assume the parameters θ are known for now.

    We wish to calculate a best guess of where the state currently is.

    The filtering problem:

    $$p(x_t \mid y_{1:t}; \theta), \qquad \text{for each } t = 1, \ldots, T.$$

    Application is clear for tracking.


  • Online problems: finance

    Dynamic portfolio allocation

    Suppose $y_t \sim N(\mu_y, \sigma_t^2)$ and a risk-free asset with return $r_f$, $\mu_y > r_f$. The portfolio return is $r_t = \omega y_t + (1 - \omega) r_f$, $0 \le \omega \le 1$, so

    $$E[r_t] = \omega \mu_y + (1 - \omega) r_f, \qquad V[r_t] = \omega^2 \sigma_t^2.$$

    We wish to maximize some utility with respect to $\omega$:

    $$\max_\omega U(\omega) = \max_\omega \left\{ E[r_t] - \frac{\kappa}{2} V[r_t] \right\}.$$

    The optimal weight $\omega_t$ will vary with $t$, according to how $\sigma_t^2$ changes; the closed form is given below.

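    For this mean-variance utility the maximizer has a closed form (a standard first-order-condition calculation, spelled out here rather than taken from the slides): setting $U'(\omega) = \mu_y - r_f - \kappa\,\omega\,\sigma_t^2 = 0$ and imposing $0 \le \omega \le 1$ gives

    $$\omega_t^{\ast} = \min\left\{1,\; \frac{\mu_y - r_f}{\kappa\, \sigma_t^{2}}\right\},$$

    so the equity weight falls as the filtered variance $\sigma_t^2$ rises.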

  • Online problems: Kalman filter

    The conjugate solution is available as everything is linear and Gaussian.

    Linear state space model:

    $$y_t = Z x_t + \varepsilon_t, \qquad x_t = T x_{t-1} + u_t.$$

    The filtering solution is given by

    $$p(x_t \mid y_{1:t}; \theta) = N(x_t \mid a_t, P_t).$$

    The filtered mean and variance have explicit solutions, and $a_t$, $P_t$ are functions of $y_{1:t}$. Unfortunately our problems are non-Gaussian, non-linear, or both! A univariate sketch of the recursion follows below.

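    A minimal univariate Kalman filter sketch (scalar state and observation; the initial moments a0, P0 are assumptions):

```python
import numpy as np

def kalman_filter(y, Z, T, sigma2_eps, sigma2_u, a0=0.0, P0=1.0):
    """Univariate Kalman filter for y_t = Z x_t + eps_t, x_t = T x_{t-1} + u_t.
    Returns the filtered means a_t and variances P_t."""
    a, P = a0, P0
    filtered = []
    for yt in y:
        # predict one step ahead
        a_pred = T * a
        P_pred = T**2 * P + sigma2_u
        # update with the new observation
        F = Z**2 * P_pred + sigma2_eps   # prediction-error variance
        K = P_pred * Z / F               # Kalman gain
        a = a_pred + K * (yt - Z * a_pred)
        P = (1.0 - K * Z) * P_pred
        filtered.append((a, P))
    return filtered
```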

  • Particle Filter Estimation

    Simulation-based methods to perform filtering in non-linear/non-Gaussian state space models.

    See Gordon, Salmond and Smith (1993) (GSS), Kitagawa (1996) and Pitt and Shephard (1999), reviewed by Doucet et al. (2000).

    We aim to have 'particles' $x_t^1, \ldots, x_t^N$ with associated discrete probability masses $\pi_t^1, \ldots, \pi_t^N$, drawn from the density $f(x_t \mid y_{1:t})$.

    Fast and recursive.


  • Online problems: particle filter

    We start at $t = 0$ with samples $x_0^k \sim p(x_0)$.

    Algorithm 1 (GSS), for $t = 0, \ldots, T-1$. We have samples $x_t^k \sim p(x_t \mid y_{1:t})$ for $k = 1, \ldots, N$.

    1. For $k = 1 : N$, sample $\tilde x_{t+1}^k \sim p(x_{t+1} \mid x_t^k)$.

    2. For $k = 1 : N$, set

    $$\pi_{t+1}^k = \frac{p(y_{t+1} \mid \tilde x_{t+1}^k)}{\sum_{i=1}^N p(y_{t+1} \mid \tilde x_{t+1}^i)}.$$

    3. For $j = 1 : N$, sample $x_{t+1}^j \sim \sum_{k=1}^N \pi_{t+1}^k\, \delta(x_{t+1} - \tilde x_{t+1}^k)$.

    Step 3 is multinomial (or stratified) sampling from the mixture. This yields an approximate sample from the desired posterior density $p(x_t \mid y_{1:t})$ as $t$ varies; a code sketch follows below.

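    A sketch of Algorithm 1 with multinomial resampling; the model-specific callables `sample_x0`, `sample_transition` and `obs_density` are assumptions standing in for $p(x_0)$, $p(x_{t+1} \mid x_t)$ and $p(y_t \mid x_t)$:

```python
import numpy as np

def bootstrap_pf(y, N, sample_x0, sample_transition, obs_density, seed=0):
    """Gordon-Salmond-Smith bootstrap particle filter (a sketch)."""
    rng = np.random.default_rng(seed)
    x = sample_x0(N, rng)                   # particles at t = 0
    loglik = 0.0
    filtered_means = []
    for yt in y:
        x = sample_transition(x, rng)       # step 1: propagate p(x_{t+1} | x_t^k)
        w = obs_density(yt, x)              # unnormalised weights p(y_{t+1} | x~)
        loglik += np.log(np.mean(w))        # log of p-hat(y_{t+1} | y_{1:t})
        pi = w / np.sum(w)                  # step 2: normalised weights
        x = x[rng.choice(N, size=N, p=pi)]  # step 3: multinomial resampling
        filtered_means.append(np.mean(x))
    return np.array(filtered_means), loglik
```

    The running `loglik` accumulates exactly the likelihood estimate used on the next slide.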

  • Parameter estimation uses the likelihood function, via the prediction decomposition:

    $$\log L(\theta) = \log p(y_1, \ldots, y_T \mid \theta) = \sum_{t=0}^{T-1} \log p(y_{t+1} \mid \theta; y_{1:t}).$$

    We need to estimate this function:

    $$\hat p(y_{1:T} \mid \theta) = \hat p(y_1 \mid \theta) \prod_{t=1}^{T-1} \hat p(y_{t+1} \mid y_{1:t}; \theta),$$

    $$\hat p(y_{t+1} \mid \theta; y_{1:t}) = \frac{1}{N} \sum_{i=1}^{N} p(y_{t+1} \mid \tilde x_{t+1}^i),$$

    where $\tilde x_{t+1}^i \sim f(x_{t+1} \mid y_{1:t}; \theta)$, from step (2). Remarkably (just as in importance sampling), the likelihood estimator $\hat p(y_{1:T} \mid \theta)$ is unbiased for $p(y_{1:T} \mid \theta)$ regardless of $N$; see Del Moral (2004).

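    Combining the unbiased estimator with a random-walk proposal gives the pseudo-marginal MH sampler used on the benchmark model earlier. A sketch, assuming a flat prior and a `loglik_hat(theta, y)` callable returning the SMC log-likelihood estimate (both assumptions):

```python
import numpy as np

def pseudo_marginal_mh(y, theta0, loglik_hat, n_iter=5000, step=0.1, seed=0):
    """Random-walk pseudo-marginal MH; the noisy log-likelihood of the current
    point is stored and reused, which keeps the chain exactly invariant."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    ll = loglik_hat(theta, y)
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        ll_prop = loglik_hat(prop, y)
        if np.log(rng.uniform()) < ll_prop - ll:  # MH accept step (flat prior)
            theta, ll = prop, ll_prop
        chain[i] = theta
    return chain
```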

  • Example: SV (with leverage):

    $$y_t = \log S_t - \log S_{t-1} = \exp(\alpha_t/2)\,\varepsilon_t,$$

    $$\alpha_{t+1} = \mu + \rho(\alpha_t - \mu) + \sigma u_t, \qquad \text{corr}(\varepsilon_t, u_t) = \rho_\ell < 0.$$

    Writing $u_t = \rho_\ell \varepsilon_t + \sqrt{1 - \rho_\ell^2}\, \zeta_t$ and using $\varepsilon_t = y_t \exp(-\alpha_t/2)$,

    $$\alpha_{t+1} = \mu + \rho(\alpha_t - \mu) + \sigma\left\{\rho_\ell\, y_t \exp(-\alpha_t/2) + \sqrt{1 - \rho_\ell^2}\, \zeta_t\right\}.$$

    Very non-linear state equation as a consequence.

    Applied to 1000 daily S&P returns (the last four years).

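    This transition plugs straight into the bootstrap filter sketched earlier; the parameter names (`rho_lev` for the leverage correlation $\rho_\ell$) are assumptions:

```python
import numpy as np

def sv_leverage_transition(alpha, y_t, mu, rho, sigma, rho_lev, rng):
    """One-step SV-with-leverage transition, conditioning on the observed
    return y_t through the implied shock eps_t = y_t exp(-alpha_t / 2)."""
    eps = y_t * np.exp(-alpha / 2.0)
    zeta = rng.standard_normal(alpha.shape)
    return (mu + rho * (alpha - mu)
            + sigma * (rho_lev * eps + np.sqrt(1.0 - rho_lev**2) * zeta))
```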

  • Example: SV (with leverage): PF

    $N = 50$ particles.

    Figure: returns; quantiles of the filtered sd; filtered points of the sd.

  • Example: SV (with leverage):

    Figure: returns; quantiles of the filtered sd.

  • Summary

    Offline and online problems arise in all areas of statistics (including machine learning).

    Markov chain Monte Carlo offers a solution to offline problems.

    It can be difficult to devise a chain which mixes quickly (for fast, reliable inference).

    Particle methods are now widely used for online inference (including car localisation).

    They need to be devised so that they work well with high-dimensional and very non-linear models.
