Using D as the programming language of choice for large scale primary storage system
Liran Zvibel, WekaIO, CEO & Co-Founder, @liranzvibel
World’s Fastest File System
© 2018 All rights reserved.



Agenda
o History and background
o WekaIO intro
o Where we stand now
o Mecca unveiled
o Q&A


History and background


Using D for Development of Large Scale Primary Storage

Liran Zvibel, Weka.IO, CTO, liran@weka.io, @liranzvibel


#DConf2016


After DConf 2015 …
o David Nadlinger came to the rescue and fixed LDC for us
o We were able to combat optimization and runtime issues
o Started working towards a no-GC runtime
o Code size and complexity started hitting us (symbol length, compilation time, exe size, etc.)
o Johan Engelen stepped in to maintain LDC for us and bridge our work with DMD


Short summary
o The D language is proving to be critical to our success
o WekaIO Matrix is a large and complex project
o The D language allows us to have a single language and codebase for the data path and also the control plane
o Introspection, CTFE and metaprogramming allow us to manage the complexity of the project
o D could improve its support for large projects, and also for use cases that require real time (not just Java or Python that compiles), around safety and GC
o No programming language is perfect, though!


WekaIO introduction


WekaIO Introduction

WHO WE ARE
WekaIO Matrix is the fastest, most scalable parallel file system for AI and technical compute workloads that ensures your applications never wait for data.

[Slide panels: THE PEOPLE, THE PARTNERS, THE ACCOLADES]


Premium Customers

WekaIO demonstrated that it was the only file system that could fully saturate the GPU cluster. With WekaIO, the data scientists were able to significantly improve productivity by removing time-consuming data copy tasks to local disks. In addition, WekaIO provided seamless integration with their massive training system data lake.


Highest Performance Primary Resilient Storage at Scale

[Chart: storage categories placed on two axes, Performance and Scale and Value: Cloud Object Store, Scale-out NAS, Scale-out Parallel NAS, All Flash NAS, SAN, AFA and WekaIO. Callouts: Speed, Simplicity, Scalability.]

o Primary Resilient Storage
o Massive Scale
  – Trillions of Files
  – 100's of Petabytes
  – Millions of IOPS
  – 100’s of GB of BW
o Lowest latency FS, higher perf than AFA
o Cloud Economics


WekaIO Matrix: Full-featured and Flexible

WekaIO Matrix Shared File System

Fully Coherent POSIX File System That Delivers Local File System Performance

Distributed Coding, More Resilient at Scale, Fast Rebuilds, End-to-End DP

Instantaneous Snapshots, Clones, Tiering to S3, Partial File Rehydration

InfiniBand or Ethernet, Hyperconverged or Dedicated Storage Server

Public or Private

S3 Compatible

Bare Metal Cloud Native


Focused On the Most Demanding Workloads

• Semiconductor verification
• Manufacturing (CFD)
• Software compilation

• Autonomous cars
• Machine Learning & AI
• IoT

• Genomics sequencing and analytics
• Drug discovery
• Microscopy

• Business analytics (SAS Grid, SAP HANA)
• Algorithmic trading
• Risk analysis (Monte Carlo simulation)

• DevOps
• Real-time analytics
• Batch analytics

• Media rendering
• Transcoding
• Visual Effects (VFX)


Why Data Locality is Irrelevant
o Local copy architectures (e.g. Hadoop, or caching solutions) were developed when 1GbitE and HDDs were standard
o Modern networks on 10Gbit Ethernet are 10x faster than SSD
o It is much easier to create distributed algorithms when locality is not important
o With the right networking stack, shared storage is faster than local storage

[Chart: Time it takes to Complete a 4KB Page Move, in microseconds (axis from 0 to 100), with bars for SSD Read, SSD Write, 10Gbit (SDR) and 100Gbit (EDR)]
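As a rough sanity check on the network bars, the time needed to put one 4KB page on the wire follows from the link rate alone. The D sketch below assumes only serialization delay (no propagation delay, protocol overhead or NIC latency); the function name `wireTimeMicros` is made up for illustration, and the SSD figures are not reproduced because the slide gives them only graphically.

```d
import std.stdio : writefln;

/// Microseconds needed to serialize `bytes` onto a link running at
/// `gbitPerSec` gigabits per second (serialization delay only).
double wireTimeMicros(double bytes, double gbitPerSec) pure nothrow @nogc @safe
{
    immutable bits = bytes * 8.0;
    immutable seconds = bits / (gbitPerSec * 1e9);
    return seconds * 1e6;
}

void main()
{
    enum pageBytes = 4096.0; // one 4KB page, as on the slide
    foreach (rate; [10.0, 100.0])
        writefln("%3.0f Gbit/s: %.2f us", rate, wireTimeMicros(pageBytes, rate));
}
```

Under these assumptions a 4KB page takes about 3.3 microseconds at 10 Gbit/s and about 0.33 microseconds at 100 Gbit/s, which is small against the 0 to 100 microsecond scale of the chart.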


Software Architecture
o Runs inside LXC container for isolation
o SR-IOV to run network stack and NVMe in user space
o Provides POSIX VFS through lockless queues to WekaIO driver
o I/O stack bypasses kernel
o Scheduling and memory management also bypass kernel
o Metadata split into many Buckets – Buckets quickly migrate ➔ no hot spots
o Supports bare metal, container & hypervisor

[Diagram labels: Application (x4); User Space; Kernel (Weka Driver, TCP/IP Stack); H/W; Frontend; Backend; SSD Agent; Networking; Distributed FS “Bucket” (x5); Data Placement; FS Metadata; Tiering; Clustering; Balancing; Failure Det. & Comm.; IP Takeover; NFS; S3; SMB; HDFS]


Actual Results from Deep Learning Bake-off
o 7x Faster 1MB Throughput: 1GB/sec vs. 7.1GB/sec
o 3.3x Better FS Traverse (Find): 6.5 Hours vs. 2 Hours
o 5.5x Better ‘ls’ Directory: 55 seconds vs. 10 seconds
o 2.6x Better 4KB IOPS/Latency: 61K IOPS, 670µsec latency vs. 165K IOPS, 271µsec latency


Fastest File System

SPEC 2014 Public Posted Results

[Chart: Latency (milliseconds, axis from 0 to 3) against # Concurrent Software Builds (axis from 60 to 1200) for IBM Spectrum Scale, NetApp and WekaIO; the 1200 mark is called out]

WekaIO does 2x the workload of IBM Spectrum Scale

Running from RAM cache


Current state of the project


Some statistics
o 1232 .d files
o About 280 KLOC
o About 2k ‘static if’ statements
o 20 ‘static foreach’ statements
o Probably many more foreach loops are in fact static
o 115 ‘mixin template’
o About 27,500 explicit template instantiations (with ‘!’)
o 30 mentions of ‘__ctfe’ in code, countless usage of actual
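For readers less familiar with the constructs being counted, here is a minimal D sketch that uses each of them once. It is not taken from the WekaIO codebase; the names `Counter`, `Stats`, `packedSize` and `whereAmI` are invented for illustration.

```d
import std.stdio : writeln;
import std.traits : Fields;

// 'mixin template': injects declarations into the scope that mixes it in.
mixin template Counter(string name)
{
    mixin("size_t " ~ name ~ ";");
}

struct Stats
{
    mixin Counter!"reads";  // explicit template instantiation with '!'
    mixin Counter!"writes";
}

// Sum of field sizes, ignoring padding.
size_t packedSize(T)()
{
    // 'static if': only the branch that matches T is compiled.
    static if (is(T == struct))
    {
        size_t total = 0;
        // 'static foreach': unrolled by the compiler, once per field type.
        static foreach (F; Fields!T)
            total += packedSize!F();
        return total;
    }
    else
        return T.sizeof;
}

// '__ctfe' is true only while the function is evaluated by the compiler.
string whereAmI()
{
    return __ctfe ? "compile time" : "run time";
}

enum atCompile = whereAmI(); // initializing an enum forces CTFE
static assert(packedSize!Stats() == 2 * size_t.sizeof);

void main()
{
    writeln(atCompile, " / ", whereAmI()); // compile time / run time
}
```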


Anecdotal cool example: verifying ABI for RPC
o Enterprise systems must support seamless upgrades
o Upgrades are performed as a “rolling” process
o Two versions must know whether RPC is ABI compatible or not.
o Standard mangling is not enough, as types may have changed between versions
o Introspection allows our no-IDL RPCs to automatically verify ABI compatibility by recursively opening structs and hashing the whole result
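The slide shows no code, so the following is only a sketch of the idea under stated assumptions: the layout is described as a string of field names and type names, nested structs are opened recursively, and the result is hashed with FNV-1a at compile time. `layoutOf`, `abiHash` and the request structs are hypothetical names, and a real check would likely also cover sizes, offsets, arrays of structs and other aggregate kinds.

```d
import std.stdio : writefln;
import std.traits : Fields, FieldNameTuple;

/// Textual description of T's layout; nested structs are opened recursively.
string layoutOf(T)()
{
    static if (is(T == struct))
    {
        string s = "{";
        static foreach (i; 0 .. Fields!T.length)
            s ~= FieldNameTuple!T[i] ~ ":" ~ layoutOf!(Fields!T[i])() ~ ";";
        return s ~ "}";
    }
    else
        return T.stringof; // leaf type: its name stands in for its layout
}

/// 64-bit FNV-1a over the code units of `s`; usable in CTFE.
ulong fnv1a(string s) pure nothrow @nogc @safe
{
    ulong h = 0xcbf29ce484222325UL;
    foreach (char c; s)
    {
        h ^= c;
        h *= 0x100000001b3UL;
    }
    return h;
}

/// Compile-time fingerprint that two peers could exchange before an RPC.
enum ulong abiHash(T) = fnv1a(layoutOf!T());

// Hypothetical message types from two software versions.
struct Extent { ulong offset; uint length; }
struct ReadRequestV1 { ulong inode; Extent extent; }
struct ReadRequestV2 { ulong inode; Extent extent; uint flags; }

static assert(layoutOf!ReadRequestV1() != layoutOf!ReadRequestV2());

void main()
{
    writefln("%s -> %016x", layoutOf!ReadRequestV1(), abiHash!ReadRequestV1);
    writefln("%s -> %016x", layoutOf!ReadRequestV2(), abiHash!ReadRequestV2);
}
```

Because the hash is an `enum`, it costs nothing at run time; a changed field anywhere in the nested structure changes the layout string and therefore, barring a hash collision, the fingerprint.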


Anecdotal pain point: delegates, scope and GC
o GC cannot be used in a real time, low latency based system
o Delegates generate GC allocations by default, as their scope may escape the current one (we cannot know that the stack remains in scope)
o Even simple std.algorithm examples, where all execution is recursive and would stay on the stack, force GC allocations
o No effective way of marking such delegates as scoped so this won’t happen
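A minimal D sketch of the mechanism behind these bullets, with invented function names. It shows the simple case only: a delegate that escapes needs a GC-allocated closure, while a delegate passed to a plain `scope` parameter does not. The slide's complaint appears to be about harder cases, such as lazily evaluated std.algorithm pipelines where the lambda is a template alias argument and no such parameter exists to annotate; that case is not reproduced here.

```d
import std.stdio : writeln;

// The returned delegate outlives this call, so the compiler moves `base`
// into a GC-allocated closure. Marking this function @nogc is rejected.
int delegate(int) makeAdder(int base)
{
    return (int x) => x + base;
}

// `scope` promises that the callee does not keep the delegate, so the
// caller's stack frame can serve as its context.
int applyTwice(scope int delegate(int) @nogc dg, int x) @nogc
{
    return dg(dg(x));
}

// Compiles as @nogc: no closure is allocated for the lambda below.
int addTwice(int base, int x) @nogc
{
    return applyTwice((int y) => y + base, x);
}

void main()
{
    writeln(makeAdder(10)(5)); // 15
    writeln(addTwice(10, 1));  // 21
}
```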


What do we care about?
o Safety
o Performance
o Brevity
o Ability to manage complexity

o What we don’t need and others do: “First 5 minutes!”
o Community must make D easier to start with


Mecca Unveiled


Again, some history
o Work started in August 2016 by Tomer Filiba

commit 51182a64360518aa4cbabfe1ce99561d2584378a
Author: Tomer Filiba <[email protected]>
Date:   Mon Aug 29 23:50:53 2016 +0300

    Mecca: make weka's infrastructure great again

o Moved to external repository May 2017
o Shachar Shemesh started working full time June 2017

o Mecca is our OS implementation, sans IO and networking modules


Some statistics
o 3 major components: Reactor, lib, containers
o 20575 LOC: 8361 in reactor, 7782 in lib, 4432 in containers

o Reactor: scheduling fibers and coordinating (synchronizing) them
o Non-GC containers: arrays, pools, queues, linked lists
o Lib: introspection, division, no-GC exception handling, CTFE-enabled hashing, non-GC iterators and algorithms, string and time manipulation.
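To give a flavor of what a non-GC container looks like in D, here is a small sketch of a fixed-capacity array whose storage lives inside the struct. It is not Mecca's API; `FixedArray`, `tryAppend` and `sumOfFirstFour` are invented names, and the `@nogc` attribute makes the compiler reject any GC allocation in the annotated code.

```d
import std.stdio : writeln;

/// Array with inline storage for up to `capacity` elements.
struct FixedArray(T, size_t capacity)
{
    private T[capacity] items; // lives wherever the struct lives
    private size_t used;

@nogc nothrow:
    /// Appends `value`; returns false instead of growing when full.
    bool tryAppend(T value)
    {
        if (used == capacity)
            return false;
        items[used++] = value;
        return true;
    }

    size_t length() const { return used; }

    const(T) opIndex(size_t i) const
    {
        assert(i < used, "index out of range");
        return items[i];
    }
}

// The compiler verifies that nothing in here touches the GC.
int sumOfFirstFour() @nogc nothrow
{
    FixedArray!(int, 4) a;
    foreach (v; 1 .. 7)
        a.tryAppend(v); // appends of 5 and 6 are refused
    int total = 0;
    foreach (i; 0 .. a.length)
        total += a[i];
    return total;
}

void main()
{
    writeln(sumOfFirstFour()); // 10
}
```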
