Recipe For Financial Computing in a Data-Rich World
Binghuan Lin
Was: Techila Technologies Ltd; Now: Risk Methodology at UBS
March 15, 2016
Disclaimer: The opinions, ideas and approaches expressed or presented are those of the author and do not necessarily reflect UBS's position. As a result, UBS cannot be held responsible for them.
Agenda
Why cloud computing and why now?
Ingredients
Cost and comparison
Why cloud computing and why now? I
Increasing computing demand from the finance industry:
Figure 1: HPC spending by sector, 2013 vs. 2018. Data source: IDC 2014
- Economics/Finance: 8.7% yearly increase from 2013 to 2018.
Why cloud computing and why now? II
A long history of co-development of financial engineering (FE) and computing technology:
Financial engineering milestones:
- 1900: Theory of Speculation, Louis Bachelier
- 1952: Portfolio Selection, Harry Markowitz
- 1973: Black-Scholes-Merton model, F. Black, M. Scholes, R. Merton
- 1990-2000: Stochastic volatility and local volatility models
- 2000-: Jump-diffusion models, Levy models, etc.

Computing milestones:
- Large-scale mainframe computers
- Time-sharing services, IBM VM operating system
- Virtual private networks (VPN)
- 2006: Amazon introduces Elastic Compute Cloud
- 2008: Microsoft Azure
- 2013: Basel III

Figure 2: History of Financial Engineering and Cloud Computing
Why cloud computing and why now? III
Increasing awareness of cloud:
Figure 3: Google search trend of cloud computing (since 2004)
Challenges
- How to ensure data/model consistency across model users, developers, reviewers, validators, etc.?
- How to allocate resources to different users with different priorities smartly?
- How to port legacy code to a new computing environment?
- How to protect sensitive data?
Taxonomy of parallel computing I
Figure 4: Parallel Computing Structure. (a) Embarrassingly parallel computing; (b) non-embarrassingly parallel computing
Data-parallel: sub-datasets are distributed to different processors;
Task-parallel: sub-tasks are distributed to different processors;
Scalable: either a scalable problem size or scalable parallelism.
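Most pricing and risk workloads fall into the embarrassingly parallel, data-parallel case. As a minimal sketch (not tied to any particular vendor's API; the payoff, parameters, and chunk sizes below are hypothetical), independent Monte Carlo chunks can be distributed over local processes:

```python
import numpy as np
from multiprocessing import Pool

def mc_chunk(args):
    """Price one independent chunk of Monte Carlo paths (an embarrassingly parallel sub-task)."""
    seed, n_paths = args
    rng = np.random.default_rng(seed)
    # Hypothetical payoff: 1y European call under GBM, S0 = K = 100, r = 1%, vol = 20%
    z = rng.standard_normal(n_paths)
    s_t = 100.0 * np.exp((0.01 - 0.5 * 0.2 ** 2) + 0.2 * z)
    return np.exp(-0.01) * np.maximum(s_t - 100.0, 0.0).mean()

if __name__ == "__main__":
    tasks = [(seed, 100_000) for seed in range(8)]  # 8 independent sub-tasks
    with Pool() as pool:
        chunk_prices = pool.map(mc_chunk, tasks)    # task-parallel dispatch
    print("MC price estimate:", np.mean(chunk_prices))
```

Because the chunks share no state, the same pattern maps directly onto a cluster or cloud back end.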
Memory bottleneck
Figure 5: Three types of memory consumption
1. Input data are large: order book data, news feeds, etc.;
2. Resulting data are large: counterparties' simulated defaults over a period, etc.;
3. Large data generated during computation: Monte Carlo scenarios, etc.
How to store/process data?
1. Numerical methods/tricks: seeding, approximation, etc. (see the sketch after this list);
2. Add more resources;
3. Cloud: Azure, AWS, Spark, Techila, etc.
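The "seeding" trick in item 1 can be illustrated with a short sketch (names and sizes are hypothetical): instead of persisting the full scenario cube, store only the seeds and regenerate each block deterministically when it is needed.

```python
import numpy as np

def scenarios_from_seed(seed, n_scenarios, n_factors):
    """Regenerate a Monte Carlo scenario block on demand from its seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_scenarios, n_factors))

# Store only the seeds (a few bytes each) instead of the generated scenarios.
seeds = [11, 12, 13]
for s in seeds:
    block = scenarios_from_seed(s, n_scenarios=10_000, n_factors=50)
    # ... consume `block`, e.g. revalue the portfolio, then discard it ...
    del block
```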
Architecture
Figure 6: Techila High Level Architecture
Spark
Figure 7: SLURM vs. Spark. (a) SLURM; (b) Spark
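For comparison, the same embarrassingly parallel pattern expressed on Spark looks roughly as follows (a minimal PySpark sketch in local mode, unrelated to the Techila API; a cluster deployment would only change the master URL):

```python
from pyspark import SparkConf, SparkContext

def mc_chunk(seed, n_paths=100_000):
    """Same hypothetical Monte Carlo chunk as before, run as a Spark task."""
    import numpy as np
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_t = 100.0 * np.exp((0.01 - 0.5 * 0.2 ** 2) + 0.2 * z)
    return float(np.exp(-0.01) * np.maximum(s_t - 100.0, 0.0).mean())

sc = SparkContext(conf=SparkConf().setAppName("mc-demo").setMaster("local[*]"))
price = sc.parallelize(range(8), numSlices=8).map(mc_chunk).mean()
print("MC price estimate:", price)
sc.stop()
```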
Distributed algorithms I: Particle filter
Figure 8: Particle filter
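A single-machine bootstrap particle filter is easy to write down; the sketch below (the model, noise levels, and particle count are hypothetical) shows the propagate / weight / resample loop. A distributed version would shard the particle cloud across workers and only exchange the normalised weights at the resampling step.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=1000, sigma_x=0.1, sigma_y=0.2):
    """Bootstrap particle filter for a scalar random-walk state observed in Gaussian noise."""
    rng = np.random.default_rng(0)
    particles = rng.normal(0.0, 1.0, n_particles)                        # initial particle cloud
    filtered_means = []
    for y in observations:
        particles = particles + rng.normal(0.0, sigma_x, n_particles)    # propagate x_t
        weights = np.exp(-0.5 * ((y - particles) / sigma_y) ** 2)        # likelihood of y_t
        weights /= weights.sum()
        idx = rng.choice(n_particles, size=n_particles, p=weights)       # multinomial resampling
        particles = particles[idx]
        filtered_means.append(particles.mean())
    return np.array(filtered_means)
```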
Distributed algorithms II: distributed optimizer
Large scale mean-variance optimisation via ADMM
\begin{align}
w_1^{k+1} &:= \operatorname*{argmin}_{w}\;\tfrac{1}{2}\|Rw - B\|_2^2 + \tfrac{\rho}{2}\|w - z^k + \mu_1^k\|_2^2 \tag{1}\\
w_2^{k+1} &:= \Pi_C\!\left(z^k - \mu_2^k\right) \tag{2}\\
z^{k+1} &:= \tfrac{1}{2}\left(S_{\lambda/\rho}\!\left(w_1^{k+1} + \mu_1^k\right) + S_{\lambda/\rho}\!\left(w_2^{k+1} + \mu_2^k\right)\right) \tag{3}\\
\mu_1^{k+1} &:= \mu_1^k + w_1^{k+1} - z^{k+1} \tag{4}\\
\mu_2^{k+1} &:= \mu_2^k + w_2^{k+1} - z^{k+1} \tag{5}
\end{align}
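A compact sketch of iterations (1)-(5) is given below, assuming R is the scenario/returns matrix, B a target vector, S_{λ/ρ} elementwise soft-thresholding, and, purely for illustration, Π_C the projection onto the budget constraint {w : 1ᵀw = 1}; the values of λ, ρ, and the iteration count are hypothetical choices.

```python
import numpy as np

def soft_threshold(v, k):
    """S_k(v): elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def project_budget(v):
    """Pi_C(v): projection onto {w : sum(w) = 1} (illustrative choice of C)."""
    return v + (1.0 - v.sum()) / v.size

def admm_mean_variance(R, B, lam=1e-3, rho=1.0, n_iter=200):
    """ADMM iterations (1)-(5) for min (1/2)||Rw - B||^2 + lam*||w||_1 with w in C."""
    n = R.shape[1]
    z = np.zeros(n); mu1 = np.zeros(n); mu2 = np.zeros(n)
    A = R.T @ R + rho * np.eye(n)        # formed once, reused in every w1-update
    RtB = R.T @ B
    for _ in range(n_iter):
        w1 = np.linalg.solve(A, RtB + rho * (z - mu1))                    # eq. (1)
        w2 = project_budget(z - mu2)                                      # eq. (2)
        z = 0.5 * (soft_threshold(w1 + mu1, lam / rho)
                   + soft_threshold(w2 + mu2, lam / rho))                 # eq. (3)
        mu1 = mu1 + w1 - z                                                # eq. (4)
        mu2 = mu2 + w2 - z                                                # eq. (5)
    return z
```

The w1-update is the only dense linear-algebra step; the thresholding and dual updates are elementwise, which is what makes the scheme attractive to distribute over large scenario sets.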
Cloud vs. in-house solution
Feature               Cloud   In-house
On-demand computing   Yes     No
Flexible cost         Yes     No
Client service        Yes     go to / complain
Negotiation           Yes     go to / complain
Why might a cloud solution be cheaper?
There are similarities between financial portfolio management and IT portfolio management in cost reduction and risk control:
- Pooling investments/resources to reduce cost;
- Diversification;
- Let the professionals do the work.
Choice of vendor I
Figure 9: Deployment time, Source: Techila
Choice of vendor II
Figure 10: Price vs. Performance, Source: Techila