39
Towards Intelligent Storage Systems Limitless Storage Limitless Possibilities https://hps.vi4io.org Image under free license (CC0) Computer Science Department Copyright University of Reading 2018-05-08 LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT Julian M. Kunkel Computer Science Workshop S H

Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Towards IntelligentStorageSystems

Limitless StorageLimitless Possibilities

https://hps.vi4io.org

Image under free license (CC0)

Computer Science Department

Copyright University of Reading

2018-05-08

LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT

Julian M. Kunkel

Computer Science Workshop

SH

)

Page 2: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

1 Introduction

2 Limitations

3 Vision

4 Smart Interfaces

5 Community Strategy

6 Summary

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 2 / 28

Page 3: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

High-Performance Storage

� I/O intense science requires high I/O performance

� The file systems of the data center DKRZ offer about 700 GiB/s throughput

I However, I/O operations are typically inefficient: Achieving 10% peak is goodI Unfortunately, prediction of performance is barely possible

� Various factors influence I/O performance

I Application’s access pattern and usage of storage interfacesI Technology / HardwareI Concurrent activity – shared nature of I/OI Tunable optimizations deal with characteristics of storage mediaI Complex interactions of these factors

� The I/O hardware/software stack is very complex – even for experts

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 3 / 28

Page 4: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Opportunities

� Chances for tools and method development for:

I Particularly: Performance models and machine learningI Predicting performance, identification of slow performanceI Diagnosing causesI Prescribing tunables/settings

� Chance to develop more suitable interfaces!

I Avoiding pitfalls of current interfacesI Harnessing smart data management

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 4 / 28

Page 5: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Outline

1 Introduction

2 Limitations

3 Vision

4 Smart Interfaces

5 Community Strategy

6 Summary

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 5 / 28

Page 6: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Peeking at the Current I/O Stack – System Perspective

� Coexistence of access paradigms

I File (POSIX, ADIOS, HDF5), SQL, NoSQL

� Semantical information is lost through layers

I Suboptimal performance

� Reimplementation of features across stack

I Unpredictable interactionsI Wasted ressources

� Restricted (performance) portability

I Optimizing each layer for each system?

Figure: Example I/O stack

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 6 / 28

Page 7: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Limitations of the current software stackPlatform

1 Zoo of interfaces

2 Low-level storage APIs

3 Loss of semantical information

4 Interference of applications / lack of coordination

5 All data treated identically

Software

1 Explicit workflows

2 Unclear access patterns (users, sites)

3 No performance awareness, manual tuning needed

4 Users lack technological knowledge for tweaking

5 Manual tiering (or with file-grained policies)Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 7 / 28

Page 8: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Zoo of InterfacesMultitude of data models

� POSIX File: Array of bytes� HDF5: Container like a file system

I Dataset: N-D array of a (derived) datatypeI Rich metadata, different APIs (tables)

� Database: structured (+arrays)� NoSQL: document, key-value, graph, tuple

Choosing the right interface is difficult – a workflow may involve several

Properties / qualities

� Namespace: Hierarchical, flat, relational� Access: Imperative, declarative, implicit (mmap())� Concurrency: Blocking vs. non-blocking� Consistency semantics: Visibility and durability of modifications

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 8 / 28

Page 9: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Low Level Data Access

Applications work with (semi)structured data

� Vectors, matrices, n-Dimensional data

A file is just a sequence of bytes!

...File

offset

Applications/Programmers must serialize data into a flat namespace

� Uneasy handling of complex data types

� Mapping is performance-critical (on HDDs)

� Vertical data access unpractical (e.g., to to pick a slice of multiple files)

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 9 / 28

Page 10: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Loss of Semantical InformationInformation hidden from file systems

� Data types

� Data semantics

� Value of data

� Type: Checkpoint, computed, original, logfile

� Data lifecycle: production, usage, deletion

Characteristics can even vary within a file, e.g. for metadata

Storage systems could use this information for

� Improving performance: Automatic tiering, caching, replication

� Simplifying management: ILM, offering alternative data views

� Correctness: Ensuring data consistency

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 10 / 28

Page 11: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Manual Tuning

� There are many options to tweak the I/O-stack

I API: posix_fadvise(), HDF5 properties, open flags, cache sizeI Via command line: lfs setstripeI Setup/initialization of a storage systemI Mounting options and procfs settings

� Many options are of technical nature

I Performance gain/loss depend on hardware, softwareI Specific to file system, API (MPI, POSIX, HDF5)I Many types of hints/tweaks are not portable

� Performance loss forces us to use these optimization

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 11 / 28

Page 12: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Critical Discussion

Questions from the users’ perspective

� Why do I have to organize the file format?

I It’s like taking care of the memory layout of C-structs

� Why do I have to convert data between storage paradigms?� Why must I provide system specific performance hints?

I It’s like telling the compiler to unroll a loop exactly 4 times

� Why is a file system not offering the consistency model I need?

I My application knows the required level of synchronization

� Why can’t I rely on a correct implementation of the consistency model?

I Parallel file systems have performance issues with most models

Being a user, I would rather code an application?

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 12 / 28

Page 13: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Outline

1 Introduction

2 Limitations

3 Vision

4 Smart Interfaces

5 Community Strategy

6 Summary

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 13 / 28

Page 14: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Personal Vision: Intelligent Storage Systems

Access paradigmDatabase File system

Local storage

ILM/HSM Self-awarenessSystem characteristics

NoSQL HDF5

Topology awareHierarchical storage

Performance model

Data replication

Semi-structured data

Content aware

Semantical access

Data transformation

Dynamic “on-disk” format

Intelligence Smart

Natural storage accessData exploration

Semantical name space Guided interface

Programmability

Data mining

Application focus User

Storage system

Arbitrary views

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 14 / 28

Page 15: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Semantic Interfaces

Guiding vs. automatism vs. technical hints

� Users provide semantic information to guide an intelligent system.� The I/O stack may exploit this information or not.� Systems could improve (via ML) over time by using the information better.

Semantic information which could be provided by users

� Data types� Metadata like meaning� Relations between data� Lifecycle (especially usage)

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 15 / 28

Page 16: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Coexistence of Storage Systems vs. Tiering

HDDHDD

Node

Memory

Node

Memory

NVRAM

SSD Tape

S3

CloudEC2

HDDHDD HDDMemory HDDBurst Buffer SSD

...

� We shall be able to use all storage technologies concurrentlyI Without explicit migration etc. put data where it fitsI Administrators just add a new technology (e.g., SSD pool) and users benefitI Should be steered by a standard and open interfaceI Open ecosystem for any vendor...

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 16 / 28

Page 17: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Additional Responsibilities of Storage System

� Mapping of data structures

� Flexible semantics

� Compute offloading, see success of big data tools

� Tight integration of workflows

� Advanced performance assessment

� Namespace based on metadata

� Management tools

� ...

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 17 / 28

Page 18: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Outline

1 Introduction

2 Limitations

3 Vision

4 Smart Interfaces

5 Community Strategy

6 Summary

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 18 / 28

Page 19: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Compression Research: Involvement

� Scientific Compression Library (SCIL)I Separates concern of data accuracy and choice of algorithmsI Users specify necessary accuracy and performance parametersI Metacompression library makes the choice of algorithmsI Supports also new algorithmsI Ongoing: standardization of useful compression quantities

� Development of algorithms for lossless compressionI MAFISC: suite of preconditioners for HDF5, pack data optimally, reduces

climate data by additional 10-20%, simple filters are sufficient

� Cost-benefit analysis: e.g., for long-term storage MAFISC pays of� Analysis of compression characteristics for earth-science related data sets

I Lossless LZMA yields best ratio but is very slow, LZ4fast outperforms BLOSCI Lossy: GRIB+JPEG2000 vs. MAFSISC and proprietary software

� Method for system-wide determination of ratio/performanceI Script suite to scan data centers...

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 19 / 28

Page 20: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

SCIL: Supported User-Space Quantities

Quantities defining the residual (error):absolute tolerance: compressed can become true value ± absolute tolerance

relative tolerance: percentage the compressed value can deviate from true value

relative error finest tolerance: value defining the abs tol error for rel compression for values around 0

significant digits: number of significant decimal digits

significant bits: number of significant decimals in bits

field conservation: limits the sum (mean) of field’s change

Quantities defining the performance behavior:compression throughput

decompression throughput

in MiB or GiB, or relative to network or storage speed

Aim to standardize user-space quantities across compressors!See https://www.vi4io.org/std/compression

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 20 / 28

Page 21: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

SCIL Provides Typical Synthetic Data

Example: Simplex (options 206, 2D: 100x100 points)

Right picture compressed with Sigbits 3bits (ratio 11.3:1)

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 21 / 28

Page 22: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Ongoing Activity: Earth-Science Data Middleware

Part of the ESiWACE Center of Excellence in H2020

Design Goals of the Earth System Data Middleware

1 Understand application data structures and scientific metadata

2 Flexible mapping of data to multiple storage backends

3 Placement based on site-configuration + performance model

4 Site-specific optimized data layout schemes

5 Relaxed access semantics, tailored to scientific data generation

6 A configurable namespace based on scientific metadata

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 22 / 28

Page 23: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Architecture

Tools and services

ESD

Application1

NetCDF4 (patched)

Application2 Application3

GRIB

HDF5 VOL (unmodified)

ESD interface

cp-esd esd-daemonesd-FUSE

Layout Datatypes

ESD (Plugin)

Performance model

Metadata backend Storage backends

Site configuration

RDBMSNoSQL POSIX-IO Object storage Lustre

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 23 / 28

Page 24: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Outline

1 Introduction

2 Limitations

3 Vision

4 Smart Interfaces

5 Community Strategy

6 Summary

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 24 / 28

Page 25: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

A Potential Approach in the Community: Following MPI

� The standardization of a high-level data model & interface

I Targeting data intensive and HPC workloadsI Lifting semantic access to a new level

� Development of a reference implementation of a smart runtime system

I Implementing key features

� Demonstration of benefits on socially relevant data-intense apps

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 25 / 28

Page 26: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Development of the Data Model and API

� Establishing a Forum similarly to MPI

� Define data model for HPC

I To have a future: must be beneficial for Big Data + Desktop, too

� Open board: encourage community collaborationS

tand

ard

1.0

Data Model

Interface

Reference Implementation

Standard-Forum

Steering Board

Committee

WorkgroupUse

-cas

es

Mini-apps

Workflows

Industry

Data centers

Scientists

Pseudo code

Mem

ber

s

Bod

ies

Nex

t Gen

erat

ion

Sta

ndar

d 2.

0

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 26 / 28

Page 27: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

API Key Features

� High-level data model for HPCI Storage understands data structures vs. byte arrayI Relaxed consistency

� Semantic namespaceI Organize based on domain-specific metadata (instead of file system)I Support domain-specific opperations and addressing schemes

� Integrated processing capabilitiesI Offload data-intensive compute to storage systemI In-situ/In-transit workflows

� Workflow managementI Managed data-driven workflow

� Performance-portabilityI Guided interfaces: Intents vs. technical hints

� Enhanced data management featuresI Embedded performance analysisI Resilience, import/export, ...

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 27 / 28

Page 28: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Introduction Limitations Vision Smart Interfaces Community Strategy Summary

Summary

� Parallel I/O is complex

I System complexity and heterogeneity increases significantlyI HPC users (scientists) and data centers need methods and tools

� Abstract semantic interfaces combined with ML will be a game changer

� Intelligent systems will increase insight and reduce the burden for users

� I will attempt a community strategy

� High performance storage web page: https://hps.vi4io.org

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 28 / 28

Page 29: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Appendix

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 29 / 28

Page 30: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

High Performance Storage

� Efficient I/O

I Performance analysis methods, tools and benchmarksI Modeling of performance and costsI Optimizing parallel file systems and middlewareI Tuning of I/O: Prescribing settingsI Management of workflowsI Intelligent storage

� Interfaces: towards domain-specific solutions

� Data reduction: compression library, algorithms, methods, re-computation

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 30 / 28

Page 31: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Other Research Interests

Scientific and High Performance Computing

� Big Data Analytics

� Domain-specific languages

� Cost-efficiency and data centers

� Scientific programming

� Efficient teaching

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 31 / 28

Page 32: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Data Model

attr ibuting

Information Concepts

Storage & Processing Concepts

Fragment

Storage System

SchemaRegistry

CoverageVar

Collection

Schema Metadata

ContainerDataset

partit ionedintousing schema

1

1..*

lives on

references0..* 0..*

stored i n

describedb y

contains

contains1

is a

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 32 / 28

Page 33: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Processing Model

runs

Job script

Cluster workload manager

Traditional cluster management

dispatchesadjustsscale out

observesdependencies

listens to

Storage & Processing Concepts

Event

processes

UserAdmin

Workflow scheduler

Code TaskResource mgmt ServiceWorkflow

Information Concepts

Slot

Fragment

Storage System

SchemaRegistry

Schema

ContainerDataset

schedulesdispatches

submitsprologepilogue

triggersevents

upon change

reads, writes

registersRunson

Runs consists of11..*

providedb y

runsoptionallyprovided

b y

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 33 / 28

Page 34: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Predicting Non-Contiguous I/O Performance

Goal: Predict storage performance based on several parameters and tunables

Alternative models

� Predict performance based on parameters

� Predict best (data sieving) settings

PM

Input

Buffer SizeData SievingData SizeFill Level

Output

estimated Performance

Parameters

Buffer SizeData SievingData SizeFill Level

Observed Values

Performance

train

(a) Performance Model

PSMInput

Data SizeFill Level

Output

best Buffer Sizebest Data Sieving

Parameters

Buffer SizeData SievingData SizeFill Level

Observed Values

Performance

train

(b) Parameter Setting Model

Figure: PM provides a perf. estimate, whereas PSM provides the “tunable” variable parameters toachieve it

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 34 / 28

Page 35: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Extracting Knowledge

� Rules can be easily extracted from decision trees� Consider a performance prediction in three classes� Rules (this is common sense for I/O experts)

I Small fill levels and data sizes are slowI Large fill levels achieve good performance

� Surprising anomaly: smaller fill level, large access sizes are slower thanmedium

Figure: First three levels of the CART classifier rules for three classes slow, avg, fast ([0,25],(25,75], > 75 MB/s). The dominant label is assigned to the leaf nodes – the probability for eachclass is provided in brackets.

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 35 / 28

Page 36: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Prescriptive Analysis: Learning Best-Practises for DKRZ

� Performance benefit of I/O optimizations is non-trival to predict

� Non-contiguous I/O supports data-sieving optimization

I Transforms non-sequential I/O to large contiguous I/OI Tunable with MPI hints: enabled/disabled, buffer sizeI Benefit depends on system AND application

� Data sieving is difficult to parameterize

I What should be recommended from a data center’s perspective?

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 36 / 28

Page 37: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

System-Wide Defaults

� Comparing a default choice with the best choice� All default choices achieve 50-70% arithmethic mean performance� Picking the best default default for stripe count/size: 2 servers, 128 KiB

I 70% arithmetic mean performanceI 16% harmonic mean performance ⇒ some choices result in slow performance

Default Choice Best Worst Arithmethic Mean Harmonic MeanServers Stripe Sieving Freq. Freq. Rel. Abs. Loss Rel. Abs.

1 128 K Off 20 35 58.4% 200.1 102.1 9.0% 0.091 2 MiB Off 45 39 60.7% 261.5 103.7 9.0% 0.092 128K Off 87 76 69.8% 209.5 92.7 8.8% 0.092 2 MiB Off 81 14 72.1% 284.2 81.1 8.9% 0.091 128 K On 79 37 64.1% 245.6 56.7 15.2% 0.161 2 MiB On 11 75 59.4% 259.2 106.1 14.4% 0.152 128K On 80 58 68.7% 239.6 62.6 16.2% 0.172 2 MiB On 5 74 62.9% 258.0 107.3 14.9% 0.16

Table: Performance achieved with any default choice

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 37 / 28

Page 38: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Applying Machine Learning

� Building a tree with different depths� Even small trees are much better than any default� A tree of depth 4 is nearly optimal; avoids slow cases

Figure: Perf. difference between learned and best choices, by maximum tree depth, for DKRZ’sporting system

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 38 / 28

Page 39: Towards Intelligent Storage Systems - Limitless Storage ... · ITo have a future: must be beneficial for Big Data + Desktop, too Open board: encourage community collaboration Standard

Research Interest SDMI Model Prediction/Prescribing with ML

Decision Tree & Rules

Extraction of knowledge from a tree

� For writes: Always use two servers; For holes below 128 KiB ⇒ turn DS on,else off

� For reads: Holes below 200 KiB ⇒ turn DS on

� Typically only one parameter changes between most frequent best choices

Figure: Decision tree with height 4. In the leaf nodes, the settings (Data sieving, server number, stripe size)and number of instances for the two most frequent best choices

Julian M. Kunkel SH LIMITLESS POTENTIAL | LIMITLESS OPPORTUNITIES | LIMITLESS IMPACT 39 / 28