Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.1
Self-Optimizing Caches
Irfan AhmadCachePhysics
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.2
About
Irfan AhmadCachePhysics CofounderCloudPhysics Cofounder
VMware (Kernel, Resource Management),Transmeta,30+ PatentsUniversity of Waterloo
@virtualirfan
CachePhysics
Data Path Monitoring and Modeling SoftwareReal-time Predictive Modeling of Data Access Patterns
Increasing Performance & Cost Efficiency of Existing CachesPowering Next-Generation Self-Learning Caches
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.3
Latency versus Cost
•Which apps need low latency?•DRAM high $$ (10x v Flash)•Where is the bottleneck?
Persistent MemoryFLASHHDD
LatencySLAs(ms)
Cost Sensitive$$$$ $$
HFT
Ecommerce
Real-Time Decisions
Analytics
Decision Support
Ad Tech
DRAM
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.4
Caches are Critical to Every Applications
Distributed Platform
DBsKV Stores
ContainersApps
ServersCPUs
L1/L2/LLCDRAM
3D XpointFlash
NetworkStorage
…
Network Elastic Cache
Cache PerformanceHit RatioCache Size
65%128GB
Intelligent Cache Management is Non-Existent
• Is this performance good?• Can performance be improved?• How much Cache for App A vs B vs …?• What happens if I add / remove DRAM?• How much DRAM versus Flash?• How to achieve 99%ile latency of X µs?• What if I add / remove workloads?• Is there cache thrashing / pollution?• What if I change cache parameters?
Yet…
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.5
Static vs. Self-Optimizing Caches
Applications have dramatically differentoptimal cache size and parameter settings.
Today: parameters chosen based on benchmarking.One size does not fit all.
Static Caches Vulnerable to:• Thrashing, Scan pollution• Gross unfairness, Interference• Unpredictability
⇒ Overprovisioning⇒ Lack of Control
Parameter 1
Parameter 2
Parameter 3
Each point represents optimal cache settingsfor a single application
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.6
Learn performance model of applications and cache
Predict the performance of workload as f(cache size, params)
Modeling Performance in Real-TimeLower is better
42 84 128 1700
Cache Size (GB)
Late
ncy
(ms)
Cache PerformanceHit RatioCache Size
65%128GB
12
3
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.7
Understanding Cache ModelsLower is better
42 84 128 1700
Cache Size (GB)
Late
ncy
(ms)
12
3
x2 x4 x8 Models help decide useful increments of change.
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.8
Understanding Cache Models (2)Lower is better
42 84 128 1700
Cache Size (GB)
Late
ncy
(ms)
12
3
Often, most operating points are highly inefficient.
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.9
Understanding Model-based Adaptation
Single Workload. Prediction of performance under different policies.
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.10
Sample Models From Production Workloads
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.11
LIRS Adaptation Examples
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.12
2Q Adaptation Examples
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.13
Cliff Removal: New Class of Acceleration
Steer the curve? Interpolate convex hull Need Model (HPCA ’15)
Shadow partitions 𝛼𝛼, 𝛽𝛽 Steer different fractions
of refs to each Emulate cache sizes on
convex hull via hashing
𝛼𝛼
𝛽𝛽
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.14
Cliff Reduction Results
69%of potential
gain
48% 38%
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.15
Achieving Latency Targets
LatencyTarget (7 ms)
CacheAllocation(>16 GB)
Client target 95th %ile latency is 7 ms
Autoset cache partitions size to 16GB to guarantee avg latency SLOs* Throughput targets
can be implemented similarly
Cache Size (GB)
95th
%ile
Late
ncy
(ms)
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.16
Multi-Tier Sizing
* Can model network bandwidth as a function of cache misses from each tier
Tier 0 (DRAM) allocationTier 1 (3D Xpoint)
Network Misses
Tier 2 (Local Flash)Tier 3 (Remote Flash)
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.17
Towards a Self-Optimizing Data Path
Remote Tier
Monitoring Auto-Select Policies(dynamic parameters)
Lower is better
42 84 128 1700
Cache Size (GB)
Mis
s R
atio
0.0
0.2
0.4
0.6
0.8
Latency Reduction(Thrashing Remediation)
Latency Guarantees Accurate Tiering
Tier 0 allocation for this clientTier 1 allocation for this clientLatency
Target (7 ms)
CacheAllocation(>16 GB)
Client target IO latency is: 7 msGuarantee avg latency: autosetcache partition to 16GB
Multi-Tenant Isolation
vs
Client 0 Client 1
Part. 0 Part. 1
Results:• Safely quantify
impact of changes
• Often 50-150% cache efficiency improvements ($$)
• Latency SLAs met
• Fewer production fire fights
• Higher consolidation ratios
• Accurate Capacity Planning
2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.18
[email protected] 650-417-8559 @virtualirfan
CachePhysics