Multiprotocol Locking and Lock Failover in OneFS Aravind Srinivasan EMC, Isilon Storage Division


2013 Storage Developer Conference. © EMC Isilon Storage Division. All Rights Reserved

Agenda

  • Overview
  • OneFS Overview
  • Overview of the DLM in OneFS
  • Multiprotocol Locking in OneFS
  • Lock Failover in OneFS


Overview

Any clustered file system (like Isilon’s OneFS) needs a robust Distributed Lock Manager (DLM) to synchronize resources accessed from different protocol clients (such as SMB and NFS).

A failover mechanism is also needed to implement the failover semantics of these protocols, so that locks are not lost even when a node in the cluster goes down.


OneFS Overview


EMC - Isilon OneFS Cluster

  • NAS file server
  • Scalable
    Add more storage in 5 mins
  • Reliable
    8x mirror / +4 parity
  • Striped across nodes
    Single volume file system
  • 3 to 144 nodes
  • Fully symmetric peers
    No metadata servers
  • Commodity hardware
    CPU, Mem, Disks


EMC - Isilon OneFS File System

Concurrent access to all files with all protocols:

  • SMB1/SMB2
  • NFSv3/NFSv4
  • SSH
  • HTTP/FTP


OneFS – High Level Overview

OneFS is EMC-Isilon's seventh-generation operating system that provides the intelligence behind all EMC-Isilon scale-out storage systems.

It combines the three layers of traditional storage architectures—file system, volume manager and RAID—into one unified software layer, creating a single intelligent file system that spans all nodes within a cluster.


OneFS – High Level Overview

Isilon's OneFS enables:

  • Independent or linear scalability of performance and capacity
  • A single point of management for large and rapidly growing repositories of data
  • Mission-critical reliability and high availability with state-of-the-art data protection


OneFS DLM Overview


DLM In OneFS

Goal of DLM

[Figure: two writers to /ifs/somefile on the OneFS volume each take an EX-lock on the file's lk resource through the DLM module (lk) before writing (steps 1 and 2), so the file contents stay intact.]


DLM in OneFS - Overview

The DLM in OneFS is called LK and is split into two distinct roles: Initiator and Coordinator


LK - Coordinator

  • Locks are coordinated in lk.
  • Each resource is coordinated by a particular node.
  • The lk coordinator node arbitrates locking within the cluster for a particular subset of resources.
  • The coordinator is chosen by a numeric transformation of the resource ID; in the simplest case, this is the ID modulo the number of nodes in the cluster.
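The simplest-case rule (resource ID modulo the number of nodes) can be sketched as follows. This is a minimal illustration, not the LK implementation: the function name `coordinator_for`, the integer node IDs, and the choice to index into a sorted node list are all assumptions made for the example.

```python
def coordinator_for(resource_id: int, node_ids: list[int]) -> int:
    """Pick the coordinating node for a resource (hypothetical helper).

    Simplest-case rule from the slide: resource ID modulo the node count.
    Sorting the node list is an assumption so that every node computes
    the same answer regardless of the order it learned about its peers.
    """
    if not node_ids:
        raise ValueError("cluster has no nodes")
    ordered = sorted(node_ids)
    return ordered[resource_id % len(ordered)]


nodes = [1, 2, 3]
# Map a handful of resource IDs to their coordinating node.
placement = {rid: coordinator_for(rid, nodes) for rid in range(6)}
print(placement)
```

Because every node applies the same pure function to the same inputs, any initiator can find the coordinator for a resource without asking a central directory. The mapping changes whenever the node count changes.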


LK - Initiator

  • The initiator is the one requesting the lock.
  • On the initiator side, there is one entry for each resource for which there is a local owner or waiter.


LK – Coordinated Two-Tier Locking


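[Figure: message sequence between Initiator 1, the Coordinator, and Initiator 2 for a resource R. Labels recovered from the diagram, in order: "Need: EX, Wants: None"; Req(R, X, N); "Holds: EX, Goal: EX" with Grant(R, X, X); Req(R, X, N); "Holds: EX, Goal: NONE" with Grant(R, X, N); Release(R); "Holds: EX, Goal: EX" with Grant(R, X, X).]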
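One reading of the Req/Grant/Release exchange is that the coordinator grants an exclusive lock to the first requester, queues later requesters, and hands the lock over on release. The toy model below sketches only that reading. The class name `Coordinator`, the `"revoke"` message, and the log format are hypothetical and are not taken from LK.

```python
from collections import deque


class Coordinator:
    """Toy arbiter for exclusive (EX) ownership of resources (illustrative only)."""

    def __init__(self):
        self.holder = {}   # resource -> initiator currently holding EX
        self.waiters = {}  # resource -> deque of initiators waiting for EX
        self.log = []      # messages the coordinator would send, in order

    def request(self, resource, initiator):
        if resource not in self.holder:
            # Nobody holds the lock: grant immediately.
            self.holder[resource] = initiator
            self.log.append(("grant", resource, initiator))
        else:
            # Lock is held: queue the requester and (hypothetically)
            # ask the current holder to give the lock back.
            self.waiters.setdefault(resource, deque()).append(initiator)
            self.log.append(("revoke", resource, self.holder[resource]))

    def release(self, resource, initiator):
        if self.holder.get(resource) != initiator:
            raise ValueError("only the current holder can release")
        queue = self.waiters.get(resource)
        if queue:
            # Hand the lock to the longest-waiting initiator.
            next_holder = queue.popleft()
            self.holder[resource] = next_holder
            self.log.append(("grant", resource, next_holder))
        else:
            del self.holder[resource]


coord = Coordinator()
coord.request("R", "initiator1")
coord.request("R", "initiator2")
coord.release("R", "initiator1")
print(coord.log)
```

The sketch keeps all arbitration state on the coordinator; a real two-tier design would also cache state on each initiator, as the LK - Initiator slide describes.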


Multiprotocol File Sharing Environments

The same file data is accessed by both UNIX/Linux and Windows users concurrently.

[Figure: a NAS shared between a UNIX domain and a Windows domain.]


Multiprotocol Locking in OneFS

  • Locks in LK are coordinated per domain.
  • There can be multiple lock domains in existence at any time, each one controlling locks for a different aspect of the system, e.g. the OPLOCK domain and the CBRL domain.
  • Locks within a domain contend with each other.
  • This concept of domains enables OneFS to implement multiprotocol locking support.


Multiprotocol Locking in OneFS

We can define any LK domain and share it between protocols to make them contend with each other.

OneFS tries to coordinate the share modes from NFSv4 and SMB clients by having a domain shared by both the protocols.
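The effect of a shared domain can be sketched with a lock table keyed by (domain, resource): requests in the same domain contend, requests in different domains do not. This is an illustrative model only. The class `DomainLockTable`, the domain name `"share_mode"`, and the two-mode SH/EX conflict rule are assumptions, and real SMB and NFSv4 share modes are richer than this.

```python
class DomainLockTable:
    """Toy lock table in which locks contend only within their own domain."""

    def __init__(self):
        self.held = {}  # (domain, resource) -> list of (owner, mode)

    def try_lock(self, domain, resource, owner, mode):
        """Try to take a lock; mode is "SH" (shared) or "EX" (exclusive).

        Returns True if granted, False if it conflicts with another owner
        holding a lock on the same resource in the same domain.
        """
        if mode not in ("SH", "EX"):
            raise ValueError(f"unknown mode: {mode}")
        holders = self.held.setdefault((domain, resource), [])
        for other_owner, other_mode in holders:
            # Two different owners conflict if either side is exclusive.
            if other_owner != owner and "EX" in (mode, other_mode):
                return False
        holders.append((owner, mode))
        return True


table = DomainLockTable()
# An SMB client and an NFSv4 client use the same (hypothetical) domain,
# so the second request contends with the first.
smb_ok = table.try_lock("share_mode", "/ifs/somefile", "smb-client", "EX")
nfs_ok = table.try_lock("share_mode", "/ifs/somefile", "nfsv4-client", "SH")
# The same file in a different domain does not contend.
other_ok = table.try_lock("OPLOCK", "/ifs/somefile", "nfsv4-client", "SH")
print(smb_ok, nfs_ok, other_ok)
```

Making two protocols contend is then a matter of pointing both protocol servers at the same domain name rather than teaching each protocol about the other's lock types.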


OneFS Lock Failover Overview


Lock Failover in OneFS

There are protocols like NFS which require locks to stay even when a node in the cluster goes down.

LK on its own is a pure DLM without any failover semantics, so if a node goes down, all its LK locks will be lost.

In order to implement lock failover, OneFS has a component called LKF, which is a consumer of LK with failover support.


LKF in OneFS

LKF Terms

  • LKF Initiator – The node to which the client is connected.
  • LKF Primary Delegate – The node which talks to LK to get the locks on behalf of the client. This is chosen by hashing the client name with the number of nodes in the cluster.
  • LKF Backup Delegate(s) – The node(s) which stay in sync with the Primary Delegate.


Note: These are different from the LK Coordinator and initiator
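Delegate selection by hashing the client name can be sketched as below. The slides do not say which hash is used or how backup delegates are picked, so SHA-256 and "the next nodes in sorted order" are assumptions for illustration, as is the function name `lkf_delegates`.

```python
import hashlib


def lkf_delegates(client_name, node_ids, backups=2):
    """Return (primary, [backup, ...]) for a client (hypothetical helper).

    Assumptions: SHA-256 as the hash, and backups chosen as the nodes
    that follow the primary in sorted order, wrapping around the list.
    """
    if not node_ids:
        raise ValueError("cluster has no nodes")
    ordered = sorted(node_ids)
    digest = hashlib.sha256(client_name.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    # Never pick more delegates than there are nodes.
    count = min(backups + 1, len(ordered))
    chosen = [ordered[(start + i) % len(ordered)] for i in range(count)]
    return chosen[0], chosen[1:]


primary, backup_nodes = lkf_delegates("client-a", [1, 2, 3, 4])
print(primary, backup_nodes)
```

Keying on the client name rather than the resource ID means all of one client's locks land on the same delegate set, which is a different mapping from the per-resource LK Coordinator.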


OneFS LKF - Overview

[Figure: a Client connected to the LKF Initiator, alongside the LKF Primary Delegate and LKF Backup Delegates (1) and (2). The lock record C1/ID1/F1/Type appears three times in the diagram.]


LKF in OneFS

Failover Scenario

  • Node with the lock (Primary Delegate) goes down.
  • As part of the group change:
    The API is first suspended to prevent any new requests from coming in.
    Backup Delegates take over as primary and get the locks for the client.
    Once this is done, the API is resumed.
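The suspend, take over, resume sequence can be sketched as a small state machine. This is a toy model under stated assumptions: the class `LkfGroup`, its method names, and the record tuple (client, lock ID, file, type) are hypothetical, and the step where the new primary re-acquires locks from LK is represented only by returning the records it already holds.

```python
class LkfGroup:
    """Toy model of one primary delegate and its backup delegates."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = list(backups)
        self.api_suspended = False
        # Every delegate keeps the same set of lock records.
        self.records = {node: set() for node in [primary, *backups]}

    def lock(self, client, lock_id, file, lock_type):
        if self.api_suspended:
            raise RuntimeError("API suspended during group change")
        record = (client, lock_id, file, lock_type)
        # Keep primary and backups in sync by writing to all of them.
        for node in self.records:
            self.records[node].add(record)

    def node_down(self, node):
        """Handle a group change; returns the records held by the primary."""
        self.api_suspended = True  # step 1: block new requests
        try:
            self.records.pop(node, None)
            if node == self.primary:
                if not self.backups:
                    raise RuntimeError("no backup delegate left")
                # Step 2: a backup takes over as primary. A real system
                # would now re-acquire these locks from LK.
                self.primary = self.backups.pop(0)
            elif node in self.backups:
                self.backups.remove(node)
            return sorted(self.records[self.primary])
        finally:
            self.api_suspended = False  # step 3: resume the API


group = LkfGroup("n2", ["n3", "n1"])
group.lock("C1", "ID1", "F1", "EX")
held = group.node_down("n2")
print(group.primary, held)
```

Suspending the API for the whole group change is what keeps a new request from racing with the takeover and observing a delegate set that is only partly updated.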


LKF in OneFS - Contd

The Primary and the Backup Delegates must always stay in sync.

When a node goes down, one of the backup delegates will take over as the primary and get the locks held by the client.

Currently used only by NFS clients in OneFS.


LKF Challenges

LKF can be extended to support other protocols like SMB3.

The main challenge is to conform to the protocol-specific requirements and tweak the system accordingly.

LKF can also be extended to fail over other information in addition to locks, by having a blob of data be failed over rather than just the lock.



Summary

Distributed locking in OneFS is achieved by using a OneFS-specific DLM called LK.

The domain concept in LK makes it possible to enable multiprotocol locking support in OneFS.

Lock Failover in OneFS is achieved using a system called LKF which is a consumer of LK.


Questions?

Contact

Aravind Srinivasan [email protected]
