Sparse Optimization Methods and Statistical Modeling with Applications to Finance

Michael Ho
Department of Mathematics, University of California, Irvine
March 25, 2016



  • Outline

    1 Introduction and Contributions
      Mean-Variance Portfolio Selection
      Research Contribution

    2 Pairwise Weighted Elastic Net

    3 Covariance estimation from High Frequency Data

    4 Conclusion


  • Introduction and Contributions

    Section 1

    Introduction and Contributions


  • Introduction and Contributions Mean-Variance Portfolio Selection

    Outline

    1 Introduction and Contributions
      Mean-Variance Portfolio Selection
      Research Contribution

    2 Pairwise Weighted Elastic Net

    3 Covariance estimation from High Frequency Data

    4 Conclusion


  • Introduction and Contributions Mean-Variance Portfolio Selection

    Modern Portfolio Theory

    Modern portfolio theory (MPT) considers the following question: suppose an investor needs to invest in a portfolio of assets. How should the investor choose the portfolio?

    To answer this question, MPT makes the following assumptions: investors make decisions based only on expected return and risk, and given two portfolios with the same expected return, an investor will choose the lower-risk portfolio.


  • Introduction and Contributions Mean-Variance Portfolio Selection

    Mean-variance criteria can be formulated as a quadratic program

    Suppose there are N risky (random-return) assets, and denote the single-period return of the nth asset as r_n. Then a mean-variance optimal portfolio w can be written as the solution to the following quadratic program (QP)

    min_w   w^T Γ w
    s.t.    w^T E[r] ≥ η ≥ 0
            w^T 1 = const                         (MV)

    where Γ is the covariance matrix of r. Here we assume E[r] ≠ 0 and Γ is positive definite. The above problem is convex, and there are many techniques for solving (MV).
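    Since (MV) is a convex QP, off-the-shelf solvers handle it directly. Below is a minimal sketch using SciPy's SLSQP; the three-asset Γ, E[r], and η values are made up for illustration and do not come from the dissertation.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative 3-asset inputs (not from the dissertation):
Gamma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
Er = np.array([0.05, 0.08, 0.12])   # expected returns E[r]
eta = 0.07                          # required expected return

res = minimize(
    lambda w: w @ Gamma @ w,        # objective: portfolio variance w^T Gamma w
    x0=np.full(3, 1 / 3),
    constraints=[
        {"type": "ineq", "fun": lambda w: w @ Er - eta},   # w^T E[r] >= eta
        {"type": "eq",   "fun": lambda w: w.sum() - 1.0},  # w^T 1 = const (taken as 1)
    ],
    method="SLSQP",
)
w = res.x
```

    The budget constant is taken as 1 here; any nonzero constant works up to rescaling.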


  • Introduction and Contributions Mean-Variance Portfolio Selection

    Sharpe ratio optimal portfolio

    If r_F is the return of a risk-free asset, the excess return of the risky assets is defined as r − r_F. The Sharpe ratio (SR) optimal portfolio of risky assets can be computed via

    max_w   w^T μ / sqrt(w^T Γ w)
    s.t.    w ≠ 0

    where μ is the mean of r − r_F. Since the SR is invariant to positive scaling, this can be reformulated (up to a constant scaling) as

    min_w   w^T Γ w − w^T μ

    The SR optimal portfolio coincides with the risky component of the mean-variance optimal portfolio.
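    The unconstrained reformulation makes the SR-optimal direction explicit: the minimizer of w^T Γ w − w^T μ is ½ Γ⁻¹μ, so any positive multiple of Γ⁻¹μ is SR-optimal. A small sketch with toy numbers (not from the dissertation):

```python
import numpy as np

def sharpe_optimal(Gamma, mu):
    # Minimizer of w^T Gamma w - w^T mu is (1/2) Gamma^{-1} mu; since the
    # Sharpe ratio is scale-invariant, any positive multiple is SR-optimal.
    return np.linalg.solve(Gamma, mu)

def sharpe_ratio(w, Gamma, mu):
    return (w @ mu) / np.sqrt(w @ Gamma @ w)

# Illustrative two-asset values:
Gamma = np.array([[0.04, 0.01], [0.01, 0.09]])
mu = np.array([0.05, 0.08])
w = sharpe_optimal(Gamma, mu)

# Scaling w does not change the Sharpe ratio:
assert np.isclose(sharpe_ratio(w, Gamma, mu), sharpe_ratio(3.7 * w, Gamma, mu))
```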


  • Introduction and Contributions Mean-Variance Portfolio Selection

    The mean-variance criterion is subject to parameter uncertainty

    Implementation of the mean-variance criterion is impeded by a lack of information:

    The mean and covariance are unknown

    An intuitive workaround is to estimate the mean and covariance using sample averages from past return data and plug them into the original MV problem:

    min_w   w^T Γ̂ w − w^T μ̂

    Applied to the stock market, out-of-sample portfolio performance using this technique is poor:

    Noisy data
    Non-stationary statistics
    Ill-conditioned covariance matrix (high sensitivity to errors)


  • Introduction and Contributions Research Contribution

    Outline

    1 Introduction and Contributions
      Mean-Variance Portfolio Selection
      Research Contribution

    2 Pairwise Weighted Elastic Net

    3 Covariance estimation from High Frequency Data

    4 Conclusion


  • Introduction and Contributions Research Contribution

    Overview

    Research investigates two aspects of mean-variance portfolios

    Robustness of mean-variance criterion to modeling errors

    Portfolio design is sensitive to modeling and parameter assumptions. Performance can be severely degraded when incorrect assumptions are made.

    Parameter estimation

    Parameters such as the mean and variance are needed for many portfolio selection criteria. They are often unknown but can be estimated from historical data. Accurate estimation is essential to achieving robust performance.


  • Introduction and Contributions Research Contribution

    Contributions of Dissertation

    1. Weighted elastic net penalized criterion

    A penalization approach that improves portfolio performance under parameter uncertainty. Material presented during the candidacy examination (Nov 2014). The method improves on other techniques proposed in the literature. SIAM J. Financial Math. (with J. Xin, Z. Sun), Vol. 6, 2015.

    2. Robust covariance estimation from high frequency data

    Addresses market microstructure noise, asynchronous trading, and jumps. A sparse modeling approach (ℓ1, spike and slab) adds robustness to jumps. The method outperforms simpler techniques proposed in the literature.


  • Pairwise Weighted Elastic Net

    Section 2

    Pairwise Weighted Elastic Net


  • Pairwise Weighted Elastic Net

    Pairwise Weighted Elastic Net

    To address parameter uncertainty the following is proposed

    Pairwise weighted elastic net (PWEN) penalized criterion

    min_w   w^T Γ̂ w − w^T μ̂ + |w|^T ∆ |w| + ||w||_{β,ℓ1}

    ∆ is a positive semidefinite matrix with non-negative entries, and β is non-negative:

    ||w||_{β,ℓ1} = Σ_i β_i |w_i|

    This reduces to the weighted elastic net penalty when ∆ is diagonal.
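    One standard way to handle the non-smooth PWEN objective is the split w = p − n with p, n ≥ 0, so that p + n plays the role of |w| and the problem becomes smooth and bound-constrained (at the optimum p_i n_i = 0). The sketch below uses that idea with illustrative numbers; it is not the solver used in the dissertation.

```python
import numpy as np
from scipy.optimize import minimize

def pwen_portfolio(Gamma, mu, Delta, beta):
    # Split w = p - n with p, n >= 0; at the optimum a = p + n equals |w|,
    # so the PWEN objective becomes smooth in (p, n).
    N = len(mu)
    def obj(z):
        p, n = z[:N], z[N:]
        w, a = p - n, p + n
        return w @ Gamma @ w - w @ mu + a @ Delta @ a + beta @ a
    res = minimize(obj, np.full(2 * N, 0.01),
                   bounds=[(0, None)] * (2 * N), method="L-BFGS-B")
    return res.x[:N] - res.x[N:]

# Toy two-asset problem (made-up values):
Gamma = np.array([[0.04, 0.01], [0.01, 0.09]])
mu = np.array([0.05, 0.08])
Delta = 0.001 * np.ones((2, 2))   # PSD with non-negative entries
beta = np.array([0.001, 0.001])
w = pwen_portfolio(Gamma, mu, Delta, beta)

def pwen_objective(w):
    # The original non-smooth PWEN criterion, for checking the split solution.
    a = np.abs(w)
    return w @ Gamma @ w - w @ mu + a @ Delta @ a + beta @ a
```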


  • Pairwise Weighted Elastic Net

    PWEN promotes robustness

    Theorem
    The PWEN criterion is equivalent to a robust optimization problem

    min_w  max_{R∈A, v∈B}   w^T R w − v^T w

    A and B are parameter uncertainty sets for the covariance and mean:

    A = { R : R_{i,j} = Γ̂_{i,j} + e_{i,j} ; |e_{i,j}| ≤ ∆_{i,j} ; R ⪰ 0 }
    B = { v : v_i = μ̂_i + c_i ; |c_i| ≤ β_i }

    ∆ is assumed to be diagonally dominant.

    The PWEN criterion optimizes worst-case performance.
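    The equivalence can be checked numerically: for fixed w, the inner maximum over the boxes is attained at e_{i,j} = ∆_{i,j} sign(w_i w_j) and c_i = −β_i sign(w_i), which reproduces the PWEN penalty terms exactly. A sketch with made-up values, chosen small enough that the R ⪰ 0 constraint stays inactive:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
Gamma_hat = 0.05 * np.eye(N)          # illustrative estimates
Delta = np.full((N, N), 0.01)
beta = np.full(N, 0.02)
mu_hat = rng.normal(0.05, 0.02, N)
w = rng.normal(size=N)                # an arbitrary portfolio

# Worst case in closed form: e_ij = Delta_ij sign(w_i w_j), c_i = -beta_i sign(w_i)
e_star = Delta * np.sign(np.outer(w, w))
c_star = -beta * np.sign(w)
worst = w @ (Gamma_hat + e_star) @ w - (mu_hat + c_star) @ w

# PWEN criterion value at the same w:
a = np.abs(w)
pwen = w @ Gamma_hat @ w - mu_hat @ w + a @ Delta @ a + beta @ a

assert np.isclose(worst, pwen)   # worst-case objective equals the PWEN criterion
```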


  • Pairwise Weighted Elastic Net

    Calibration of PWEN

    Calibration of PWEN can be done by selecting an appropriate uncertainty set for the parameter estimates. Bootstrapping is one way to quantify uncertainty.

    Robust optimization interpretation used in calibration


  • Pairwise Weighted Elastic Net

    Performance Plot

    The performance benefit of PWEN and WEN is demonstrated on U.S. stock return data: 630 stocks, from January 1, 2001 to July 1, 2014, mid to large cap.


  • Covariance estimation from High Frequency Data

    Section 3

    Covariance estimation from High Frequency Data


  • Covariance estimation from High Frequency Data

    Large-Dimensional Covariance Estimation

    Covariance estimation of asset returns is an important step in portfolio optimization. More training data can improve covariance matrix estimation ... however, the time-varying nature of asset return statistics places limits on the time interval over which training data is relevant.

    Figure: Time varying volatility limits amount of relevant data


  • Covariance estimation from High Frequency Data

    Exploiting High Frequency Data

    High-frequency data provides more data in a shorter time interval, so covariance estimates can be obtained using more recent data. However, estimation of covariance from high-frequency data is complicated by

    Asynchronous returns
    Market microstructure noise
    Jumps


  • Covariance estimation from High Frequency Data

    Asynchronous trading

    Standard sample-average estimation of the covariation of returns requires that the returns of all assets be sampled on a common grid. In high frequency data, assets trade asynchronously. Resampling the data to a common grid can be performed, but it does not use all the data and may cause the covariance estimate to be non-positive-definite.


  • Covariance estimation from High Frequency Data

    Market Microstructure Noise

    Market frictions such as the bid-ask spread are a source of noise; the true efficient price is not observed. Over short time periods, price variation due to the bid/ask spread can mask the "true" efficient return:

    lim_{∆→0}  Σ_{n=0}^{T/∆}  ( P_noise(∆(n+1)) − P_noise(∆n) )²  =  ∞
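    This divergence is easy to see in simulation: for a pure-noise price series the sum of squared increments has expectation 2nσ², so it grows without bound as the sampling interval shrinks. A sketch with an illustrative noise level:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.001                        # microstructure noise std (illustrative)

def realized_variance(n_samples):
    # Pure-noise "price": i.i.d. noise around a constant efficient price.
    p = rng.normal(0.0, sigma, n_samples + 1)
    return np.sum(np.diff(p) ** 2)   # expectation 2 * n * sigma^2

coarse = realized_variance(100)      # coarse sampling grid
fine = realized_variance(100_000)    # fine sampling grid: much larger value
```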


  • Covariance estimation from High Frequency Data

    Jumps in price can corrupt estimate of covariance

    Jumps in market returns not explained by a diffusion can occur. These jumps can severely bias the covariance estimate of the diffusion component of the returns. Disentangling price movements due to the jump and diffusion components is necessary to estimate the covariance.


  • Covariance estimation from High Frequency Data

    Data Model for Hidden Price Process

    Let X_n be a vector containing all log-prices at time n, and model the discrete-time log-price as

    X_n = X_{n−1} + V_n + J_n ,   V_n ~ N(D, Γ) ,   J_n jump        (1)

    J_n and V_n are i.i.d. sequences and independent of each other. X is unobserved. D and Γ are unknown but assumed to have a known prior distribution.
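    Model (1) is straightforward to simulate. The sketch below uses made-up dimensions, drift, covariance, and jump parameters, with a Bernoulli-normal jump as one simple choice consistent with the sparse-jump priors discussed later:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 3, 500
D = np.full(N, 1e-5)                          # drift (illustrative)
L = rng.normal(scale=1e-3, size=(N, N))
Gamma = L @ L.T + 1e-6 * np.eye(N)            # diffusion covariance (PSD)
zeta, sigma_j = 0.01, 0.02                    # jump probability and scale

X = np.zeros((T, N))
for t in range(1, T):
    V = rng.multivariate_normal(D, Gamma)                       # V_n ~ N(D, Gamma)
    J = rng.binomial(1, zeta, N) * rng.normal(0, sigma_j, N)    # sparse jumps J_n
    X[t] = X[t - 1] + V + J                                     # X_n = X_{n-1} + V_n + J_n
```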


  • Covariance estimation from High Frequency Data

    Observations are noisy and missing

    Observations are noisy (market microstructure noise) and missing:

    Y_n = Ĩ_n X_n + W_n ,   W_n ~ N(0, Q)        (2)

    where Ĩ_n selects the subset of prices observed at time n and W_n is microstructure noise. V is independent of J, W, and X. Q is unknown and diagonal but assumed to have a known prior distribution. Observations are assumed MAR (missing at random) and independent of prices.


  • Covariance estimation from High Frequency Data

    Missing Data Example

    Single Asset

    Missing data can be inferred from nearby observations.

    Multiple Assets

    Low-rank structure in the covariance can allow for improved inference of missing values: missing data can be inferred from observations of other assets at the same and different times.


  • Covariance estimation from High Frequency Data

    Data Completion through Kalman smoothing

    Kalman smoothing can be used to infer missing data and remove noise. Conditioned on the parameters θ, Kalman smoothing is a recursive method for computing the posterior distribution p(x|y, θ). It applies only to normally distributed data (it computes the mean and variance).

    Rudolf Kalman, 2008

    Smoothing of noisy time series with missing data
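    A minimal scalar illustration of the idea: a random-walk state, noisy observations with a missing stretch (marked NaN), a forward Kalman filter, and a backward Rauch-Tung-Striebel pass. This is a textbook sketch, not the multivariate smoother used in the dissertation:

```python
import numpy as np

def kalman_smooth(y, q, r):
    """Scalar random-walk Kalman (RTS) smoother; NaN entries of y are
    treated as missing observations. q: state noise var, r: obs noise var."""
    T = len(y)
    m = np.zeros(T); P = np.zeros(T)        # filtered mean / variance
    mp = np.zeros(T); Pp = np.zeros(T)      # one-step predictions
    m_prev, P_prev = 0.0, 1e6               # vague prior
    for t in range(T):
        mp[t], Pp[t] = m_prev, P_prev + q   # predict
        if np.isnan(y[t]):                  # missing: skip the measurement update
            m[t], P[t] = mp[t], Pp[t]
        else:
            K = Pp[t] / (Pp[t] + r)         # Kalman gain
            m[t] = mp[t] + K * (y[t] - mp[t])
            P[t] = (1 - K) * Pp[t]
        m_prev, P_prev = m[t], P[t]
    ms, Ps = m.copy(), P.copy()             # backward (RTS) recursion
    for t in range(T - 2, -1, -1):
        G = P[t] / Pp[t + 1]
        ms[t] = m[t] + G * (ms[t + 1] - mp[t + 1])
        Ps[t] = P[t] + G**2 * (Ps[t + 1] - Pp[t + 1])
    return ms, Ps

rng = np.random.default_rng(7)
x = np.cumsum(rng.normal(0.0, 0.01, 200))   # latent efficient log-price
y = x + rng.normal(0.0, 0.05, 200)          # noisy observations
y[50:80] = np.nan                           # a stretch of missing data
ms, Ps = kalman_smooth(y, q=0.01**2, r=0.05**2)
```

    The smoothed mean `ms` fills the gap by borrowing information from both sides, which is exactly the behavior described on this slide.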


  • Covariance estimation from High Frequency Data

    Jumps

    The Kalman filter tends to over-smooth jumps.

    Jumps can contaminate the estimate of the covariance (further degrading Kalman smoothing performance):

    Γ̂ ≈ Γ + E[J J^T]        (jump bias)


  • Covariance estimation from High Frequency Data

    Sparse Jump Models

    For discrete-time modeling we consider two types of prior distributions for the jumps:

    Spike and slab
    Laplace distribution

    Both priors induce sparsity in the posterior mode of the jumps. Both models are also popular for variable selection in regression and machine learning.


  • Covariance estimation from High Frequency Data

    Spike and Slab Jump Model

    For this model the prior of J_i(t) is a mixture of a point mass at 0 (the "spike") and a normal distribution (the "slab"):

    p(j_i(t)) = ζ 1_{j_i(t)=0} + (1 − ζ) N(j_i(t); 0, σ²_{j,i}(t))


  • Covariance estimation from High Frequency Data

    Laplace Distribution

    The spike and slab distribution of J is non-continuous and multi-modal, which complicates estimation of J. As an approximation we consider the Laplace distribution

    p(j_n(t)) ∝ exp(−λ_n(t) |j_n(t)|)        (3)

    This induces a weighted ℓ1 norm in the conditional log-posterior. λ_n(t) is treated as unknown with a known (gamma) distribution. Iterative estimation of λ_n(t) induces a reweighting of the ℓ1 penalty.


  • Covariance estimation from High Frequency Data

    Laplace prior promotes sparse posterior mode

    Consider the following experiment

    Suppose κ is Laplace distributed and q is N(0,1), and we observe η = κ + q. Suppose we observe η = 0.5. The maximum likelihood estimate of κ is 0.5; the posterior mode is 0!

    Figure: likelihood, Laplace prior, and posterior versus κ. The Laplace prior promotes a sparse posterior mode.
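    This toy example is exactly the scalar soft-thresholding (lasso) problem: the posterior mode argmin_κ ½(η − κ)² + λ|κ| is sign(η)·max(|η| − λ, 0). A sketch with a unit Laplace rate (λ = 1, chosen here purely for illustration):

```python
import numpy as np

def posterior_mode(eta, lam):
    # argmin_k 0.5 * (eta - k)**2 + lam * |k|   (Gaussian likelihood, Laplace prior)
    # is the soft-thresholding operator:
    return np.sign(eta) * max(abs(eta) - lam, 0.0)

# With lam = 1 and observation eta = 0.5, the MLE is 0.5
# but the posterior mode is exactly 0:
assert posterior_mode(0.5, 1.0) == 0.0
```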


  • Covariance estimation from High Frequency Data

    Maximum a posteriori (MAP) estimation of covariance

    The MAP estimate of the covariance Γ is the mode of the posterior:

    [Γ̂, θ̂] = argmax_{Γ′,θ′}  log p(θ′, Γ′ | y)

    where θ are the nuisance parameters (jumps, noise variance, etc.). The posterior is difficult to optimize directly due to missing data, so iterative approaches are normally employed.


  • Covariance estimation from High Frequency Data

    ECM approach to MAP estimation

    The expectation conditional maximization (ECM) algorithm (Meng, Rubin 1993) alternates between two steps.

    E-step: compute the surrogate function

    G^(k)([Γ, θ]) = E_{X|Y, Γ^(k), θ^(k)}  log p(Γ, θ | y, x)

    M-step: set [Γ̂^(k+1), θ̂^(k+1)] to the conditional maximizers of G^(k)([Γ, θ]).

    The E-step is performed using the Kalman smoother (jumps are compensated for using the estimate from the prior iteration). The log-posterior increases monotonically, and the algorithm converges to a local mode under mild regularity conditions, which hold for this problem.


  • Covariance estimation from High Frequency Data

    KECM-Laplace recovery, Low Rank Covariance


    The KECM approach can recover missing prices when the covariance is low rank.


  • Covariance estimation from High Frequency Data

    KECM-Laplace recovery, High Rank Covariance

    Figure: recovered price versus time (posterior mean, observations, truth).

    Price recovery is more difficult when the covariance is high rank.


  • Covariance estimation from High Frequency Data

    KECM-Laplace recovery with Jump


    KECM-Laplace


  • Covariance estimation from High Frequency Data

    Bayesian Approach using MCMC

    Problems with the ECM approach

    Reports a single mode
    Nuisance parameters estimated
    Uncertainty not reflected in the mode

    Bayesian approach

    Posterior distribution determined
    Nuisance parameters integrated out

    Moderate jumps: single-mode posterior
    Small jumps: multimodal posterior

  • Covariance estimation from High Frequency Data

    Gibbs sampling approximation to posterior

    Computing the posterior of the covariance directly involves integration over a high-dimensional parameter space. Markov chain Monte Carlo (MCMC) approaches such as Gibbs sampling can be used to approximate the posterior efficiently:

    Sequentially draw each parameter from its conditional posterior distribution
    The sequence converges in distribution to the posterior (under some conditions)

    For this model Gibbs sampling is convenient since each conditional distribution is easy to draw from.
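    A minimal illustration of the mechanics on a toy target, a zero-mean bivariate normal with correlation ρ, where each conditional is itself normal and hence easy to draw from. The model's actual conditionals are different, but the alternating structure is the same:

```python
import numpy as np

rng = np.random.default_rng(3)
rho = 0.8                      # target: zero-mean bivariate normal, corr rho
M = 20_000
x = np.zeros((M, 2))
for m in range(1, M):
    # Each conditional of a bivariate normal is N(rho * other, 1 - rho^2):
    x[m, 0] = rng.normal(rho * x[m - 1, 1], np.sqrt(1 - rho**2))
    x[m, 1] = rng.normal(rho * x[m, 0],     np.sqrt(1 - rho**2))
samples = x[2000:]             # discard burn-in
print(np.corrcoef(samples.T)[0, 1])   # close to rho
```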


  • Covariance estimation from High Frequency Data

    MCMC Example


    MCMC captures uncertainty in parameters


  • Covariance estimation from High Frequency Data

    MCMC Movie


    MCMC escapes from local mode


  • Covariance estimation from High Frequency Data

    Results of Covariance Estimation

    Characterize performance using the normalized Frobenius norm of the error:

    sqrt( Σ_{i,j} |Γ_{i,j} − Γ̂_{i,j}|² )  /  sqrt( Σ_{i,j} |Γ_{i,j}|² )

    Relative covariance estimation error for various jump sizes and frequencies.
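    The error metric is one line of NumPy:

```python
import numpy as np

def relative_frobenius_error(Gamma, Gamma_hat):
    # Normalized Frobenius norm ||Gamma - Gamma_hat||_F / ||Gamma||_F
    return np.linalg.norm(Gamma - Gamma_hat, "fro") / np.linalg.norm(Gamma, "fro")

Gamma = np.eye(3)
assert relative_frobenius_error(Gamma, Gamma) == 0.0
assert np.isclose(relative_frobenius_error(Gamma, 2 * Gamma), 1.0)
```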

  • Covariance estimation from High Frequency Data

    Performance under GARCH(1,1)-jump model

    X_i(t) = X_i(t−1) + sqrt(h_i(t)) V_i(t) + J_i(t) Z_i(t) + D

    h_i(t+1) = b_i h_i(t) + a_i (X_i(t) − X_i(t−1) − D)² + c_i

    Relative covariance estimation error for various jump sizes and frequencies.
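    A scalar sketch of this GARCH(1,1)-jump recursion with illustrative coefficients. Note that, as in the recursion above, jump returns feed back into the volatility update through the squared increment:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 1000
a, b, c = 0.05, 0.90, 1e-6       # GARCH coefficients (illustrative, a + b < 1)
D = 0.0                          # drift
zeta, sigma_j = 0.01, 0.02       # jump frequency and size

X = np.zeros(T)
h = np.full(T, c / (1 - a - b))  # start at the stationary variance
for t in range(1, T):
    jump = rng.binomial(1, zeta) * rng.normal(0, sigma_j)    # J(t) Z(t)
    X[t] = X[t - 1] + np.sqrt(h[t - 1]) * rng.normal() + jump + D
    increment_sq = (X[t] - X[t - 1] - D) ** 2   # jumps leak into h here
    h[t] = b * h[t - 1] + a * increment_sq + c
```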


  • Covariance estimation from High Frequency Data

    Performance with stochastic noise variance

    Here we extend the GARCH(1,1) model to a stochastic microstructure noise variance:

    σ²_{o,i}(t) = a₂ (X_i(t) − X_i(t−1) − D)² + b₂

    Relative covariance estimation error for various jump sizes and frequencies.


  • Conclusion

    Section 4

    Conclusion


  • Conclusion

    Conclusion

    This dissertation has considered the application of sparse optimization and modeling to finance in two ways.

    Portfolio robustness enhancements

    The weighted and pairwise weighted elastic net penalized portfolios were shown to improve the robustness of portfolios using U.S. stock return data.

    Covariance estimation from high frequency data

    The Kalman EM approach was extended to models that include price jumps. The new approach shows enhanced performance under jump models for a variety of simulated data models (jumps, GARCH, dependent observation noise).


  • Conclusion

    Future work

    Pairwise weighted elastic net

    Further investigate calibration of the pairwise weighted elastic net. Relaxing the diagonally dominant restriction on the weighting matrix ∆ may improve performance.

    Covariance estimation from high frequency data

    Further investigate low rank + sparse matrix factorization techniques to enhance covariance estimation: reweighted nuclear norm and reweighted ℓ1 penalties.


  • Backup Charts

    Section 5

    Backup Charts


  • Backup Charts

    Solution via nuclear norm minimization

    Missing data can also be recovered using matrix completion, noting that the returns are low rank.

    Definition
    R_{i,t}: unobserved low-rank component return of asset i at time t
    J_{i,t}: unobserved sparse jump component return of asset i at time t
    X_i: unobserved efficient price of asset i at time 0
    Y_{i_k,t_k}: observed (noisy) price of asset i_k at time t_k
    S: discrete-time integration (in time) operator (rectangular method)

    Nuclear Norm Formulation

    min_{X,J,R}   ||R||_* + λ₁ Σ_k ( X_{i_k} + ((R + J)S)_{i_k,t_k} − Y_{i_k,t_k} )² + λ₂ ||J||_{ℓ1}
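    Proximal algorithms for this kind of formulation rely on the prox of the nuclear norm, which is singular-value soft-thresholding. A sketch of that building block (a full solver would also need the ℓ1 prox and the data-fit gradient):

```python
import numpy as np

def svt(M, tau):
    """Singular-value thresholding: the prox of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt   # shrink singular values

rng = np.random.default_rng(5)
A = rng.normal(size=(6, 4))
B = svt(A, 1.0)
# Thresholding can only shrink singular values:
assert np.linalg.svd(B, compute_uv=False).max() <= np.linalg.svd(A, compute_uv=False).max()
```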


  • Backup Charts

    Example Reconstruction 80 percent observed, No Noise

    Figure: reconstructed log-price (full series and zoom-in), reconstructed jump, and singular values of log returns (jumps removed), comparing Truth, Nuclear Norm Minimization, and KECM-Laplace. 80 percent observed, no noise.


  • Backup Charts

    Example Reconstruction 30 percent observed, No Noise

    Figure: reconstructed log-price (full series and zoom-in), reconstructed jump, and singular values of log returns (jumps removed), comparing Truth, Nuclear Norm Minimization, and KECM-Laplace. 30 percent observed, no noise.


  • Backup Charts

    Example Reconstruction 80 percent observed, Noise

    Figure: reconstructed log-price (full series and zoom-in), reconstructed jump, and singular values of log returns (jumps removed), comparing Truth, Nuclear Norm Minimization, and KECM-Laplace. 80 percent observed, with noise.


  • Backup Charts

    Example Reconstruction 30 percent observed, Noise

    Figure: reconstructed log-price (full series and zoom-in), reconstructed jump, and singular values of log returns (jumps removed), comparing Truth, Nuclear Norm Minimization, and KECM-Laplace. 30 percent observed, with noise.


  • Backup Charts

    ECM algorithm for Laplace jump model

    Initialize estimates of Γ, σ², and J
    while not converged:
        E-step: compute the posterior distribution of X given Y, Γ, σ², J, D with the Kalman smoother
        M-step: update Γ, D, and σ², assuming J is fixed
        Compute the MAP estimate of J given Γ and σ² using ADMM, FISTA, etc.
        Update λ_i(t) (effectively reweighting the ℓ1 penalty)

    The algorithm for the spike and slab model is similar.


  • Backup Charts

    Gibbs sampling approach for spike and slab

    Initialize the parameters Θ^(0) = [Y_miss, X, Γ, D, J, σ², ζ, σ²_j]

    for m = 0 ... M
        for k = 1 ... 8
            Sample Θ_k^(m+k/8) from p(Θ_k | Θ_{−k}^(m+(k−1)/8))

    Discard the first P samples ("burn-in")
    Use the remaining covariance samples to estimate the posterior mean of the covariance


  • Backup Charts

    Example: Bootstrapping the uncertainty set when statistics are unknown

    Here we illustrate one way to calibrate the uncertainty set for μ:

    Suppose we have training-data returns r(1), ..., r(T)
    Randomly take T samples from {r(1), ..., r(T)} (with replacement); call these ζ(1), ..., ζ(T)
    Use the empirical distribution of μ̂(ζ(1), ..., ζ(T)) − μ̂(r(1), ..., r(T)) as a proxy for the estimation error
    This can be done via Monte Carlo by resampling many times
    β can be selected as a percentile of the empirical distribution
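    The steps above can be sketched directly; the return matrix below is simulated i.i.d. normal purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
r = rng.normal(0.0005, 0.01, size=(250, 5))   # made-up daily returns, T=250, N=5
mu_hat = r.mean(axis=0)

B = 2000
errors = np.empty((B, r.shape[1]))
for b in range(B):
    resample = r[rng.integers(0, len(r), len(r))]     # T draws with replacement
    errors[b] = resample.mean(axis=0) - mu_hat        # proxy for estimation error

beta = np.percentile(np.abs(errors), 90, axis=0)      # per-asset uncertainty radius
```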


  • Backup Charts

    Sample Average Plug-in Performance is Disappointing

    Consider the following experiment

    Return data were collected from 20 U.S. stocks between 7-2001 and 7-2013. The Sharpe ratio optimal portfolio was computed based on 55 days of training data, and portfolio performance was evaluated using the next 30 trading days.

    The performance of the plug-in mean-variance portfolio is disappointing.


  • Backup Charts

    Bootstrap versus Normal-χ2 Approximation Calibration

    Calibration using bootstrap Calibration using Normal-χ2 approximation

