23
sambaXP 2020 Report from the field: Samba clustering with GlusterFS Anoop C S <[email protected]> Günther Deschner <[email protected]> 1

GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

sambaXP 2020

Report from the field: Samba clustering with GlusterFS

Anoop C S

<[email protected]>

Günther Deschner

<[email protected]>

1

Page 2: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

sambaXP 2020

Agenda

2

Report from the field: Samba clustering with GlusterFS

Deployment report

● Clustering issues● Correctness problems● Performance limitations

Overview

● Samba and GlusterFS● Red Hat Gluster Storage

Moving forward

● Improvements in queue● Future goals and SMB3

Page 3: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Samba, GlusterFS and Red Hat Gluster Storage (RHGS)

Report from the field: Samba clustering with GlusterFS

3

Page 4: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Samba + GlusterFS

Red Hat Gluster Storage (RHGS)

4

Gluster

Samba, GlusterFS and Red Hat Gluster Storage (RHGS)

● Gluster is a free and open source software scalable

network filesystem

● On-premise, public and private cloud deployments

● Replication, Quota, Snapshots, geo-replication

● Runs on every FS that support extended attributes

● FUSE fs client

● Ansible automation

● Supports variety of access protocols including NFS and

SMB

Page 5: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Samba + GlusterFS

Red Hat Gluster Storage (RHGS)

5

Gluster architecture I

Samba, GlusterFS and Red Hat Gluster Storage (RHGS)

● Bricks are low level components

● Multiple bricks → volumes

● Various different volume types:

○ Distribute

○ Replicate

○ Arbiter

○ Disperse

● volfile configuration

Page 6: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Samba + GlusterFS

Red Hat Gluster Storage (RHGS)

6

Gluster architecture II

Samba, GlusterFS and Red Hat Gluster Storage (RHGS)

● Client and server translator stack

(similar to the Samba VFS layer)

● md-cache translator primarily important for Samba

● Performance enhancements specific for Samba:

gluster volume set <volume>

performance.cache-samba-metadata on

Page 7: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Samba + GlusterFS

Red Hat Gluster Storage (RHGS)

7

Samba & CTDB

Samba, GlusterFS and Red Hat Gluster Storage (RHGS)

● Layered product:

Samba version *always* ahead of RHEL Samba version

● Two options for using gluster:

○ vfs_glusterfs module, consuming gfapi (default)

○ Fuse re-export

● macOS clients using vfs_fruit

(now fully supported with non-local FS as well)

● CTDB for HA of Samba/Winbind and public IPs

● CTDB uses distinct glusterfs volume for recovery

lockfile

Page 8: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Common problems from customer setups

Report from the field: Samba clustering with GlusterFS

8

Page 9: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

UNHEALTHY cluster

Failure to acquire recovery lock for CTDB

9

Chances of occurrence: frequent

Common problems from customer setups

● Node reboot

● Unavailability of common recovery lock

○ Accidental creation of recovery lock file locally

● Availability of shared storage post-reboot

● Automatic mounting of shared volume holding recovery

lock(/etc/fstab)

○ Dependency on glusterd(GlusterFS daemon)

● CTDB systemd service file modifications

Page 10: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

IO interruption and/or disconnect

Misconfigured public/private network separation

10

Chances of occurrence: rare

Common problems from customer setups

● Cable unplugged and node down/reboot scenarios

● Durable handle reconnect

● Separate GlusterFS and CTDB traffic

○ Network teaming?

● Resolving hostnames, if used, to correct network

interface

● External client facing IP on a different network

Page 11: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

NT_STATUS_ACCESS_DENIED for users

Diverging netbios namespaces on cluster nodes

11

Chances of occurrence: rare

Common problems from customer setups

● AD Domain membership:

machine account credentials in CTDB

○ Only join AD on one node!

○ Not having same “netbios name” set splits common

cluster account into individual node accounts

● Standalone setup:

local SAM is shared via CTDB

○ Not having same workgroup and netbios name on all

nodes creates diverging SIDs

○ Result: ACL authorization failures

Page 12: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

NT_STATUS_ACCESS_DENIED for users

Incorrect usage of POSIX permissions

12

Chances of occurrence: common

Common problems from customer setups

● Setting up ACLs on share root

○ Enabling -o acl mount option of GlusterFS FUSE mount

● Existence of default ACLs on directory

● Using vfs_acl_xattr

● Special treatment with ‘ignore system acls = yes’

● Problems with switching same share with and without

vfs_acl_xattr

Page 13: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Understanding performance bottlenecks

Report from the field: Samba clustering with GlusterFS

13

Page 14: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Performing better with a distributed file system backend

Primary focus areas

14

Broadly classified into IO and metadata related workloads

Understanding performance bottlenecks

▸ Mostly getxattr and stat calls

▸ Frequent invocations

▸ Working with a distributed file system underneath

▸ Improvements with get_real_filename and

md-cache translator

Takeaway: GlusterFS client side caching for metadata heavy requests

▸ Basically involves read/write operations

▸ Writes should reach bricks if online

▸ Previous implementation with libgfapi async APIs

▸ Limitations

▸ Using pthreadpool infrastructure

Takeaway: Parallelism and asynchronous nature in scheduling read/write operations

Page 15: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Comparison in numbers

IO benchmarking with Disk Performance tool

15

Throughput measured for read-write operations

Understanding performance bottlenecks

READ WRITE

Samba on top of 12x(2+1) Arbiter volume spread across 3 nodes with CTDBBlock count = 16K, Thread count = 12, Mode = random

Page 16: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Comparison in numbers

Plain listing of large directories

16

Delay in displaying contents of a directory with large number of small files

Understanding performance bottlenecks

Samba on top of 12x(2+1) Arbiter volume spread across 4 nodes with CTDBClient = smbclient, Number of 1KB files = 16K

Samba on top of 12x(2+1) Arbiter volume spread across 4 nodes with CTDBClient = smbclient, Number of 1KB files = 16K

Page 17: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

More performance boost in queue

Recursive listing of large directory tree

17

Time taken for file system crawl

Understanding performance bottlenecks

▸ Relatively large directory structure

・ with depth level of 10, 100 etc..

▸ Proposed solution from GlusterFS

・ implemented within distribution layer

・ prefetching logic to fill readdir buffer

・ performance.readdir-cache volume set option

▸ Native client improvements around 100%

▸ Samba integration?

・ yet to explore :-) but hopeful

Page 18: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Miscellaneous improvements

Report from the field: Samba clustering with GlusterFS

18

Page 19: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Avoiding data corruption with re-export of FUSE mount

New glusterfs_fuse VFS module

19

Added advantage of get_real_filename

Miscellaneous improvements

▸ Motivation: absence of get_real_filename previously

▸ Performance improvement with creation of files

・ Exposed in gluster via xattr: glusterfs.get_real_filename:<filename>

▸ Problems with file_id calculation

・ Difference in Device-Id on FUSE mounts from cluster nodes

▸ Reuse of existing logic (vfs_fileid)

▸ New module

・ Last one in the stack, No additional options

・ Easy to use without any overhead

▸ Still involves FUSE context switches

Page 20: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

SMB_VFS_FCNTL

New VFS interface for fcntl()

20

Directed towards handling of open file descriptor flags

Miscellaneous improvements

▸ Motivation: actually a regression seen with GlusterFS

・ O_NONBLOCK set bypassing VFS

▸ Contact VFS for handling fd flags

▸ Introduction of SMB_VFS_FCNTL

・ Complexity involved with different types of fcntl flags

・ Automatic detection of flags

・ May be the only VFS interface macro with variable arguments !

▸ Just one caller?

▸ Consumed by vfs_glusterfs with a hack

Page 21: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Ongoing developments around Samba integration

Report from the field: Samba clustering with GlusterFS

21

Page 22: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

Going ahead. . .

Next steps

22

General upcoming developments

Ongoing developments around Samba integration

▸ Correctness in durable handle reconnect with GlusterFS

▸ Slow directory listing (large number of entries)

▸ Proxy mode with vfs_glusterfs/gfapi to limit memory consumption

▸ SMB3 Multichannel

・ Oplock/lease replay

・ Enabling socket_wrapper for multichannel self test

・ Integration with CTDB

▸ Witness

・ Dependency on async DCE/RPC server

・ Prototype implementation done in 2015

▸ Automation/CI

Page 23: GlusterFS Samba clustering with Report from the field · 2020. 6. 15. · Samba, GlusterFS and Red Hat Gluster Storage (RHGS) Client and server translator stack (similar to the Samba

linkedin.com/company/red-hat

youtube.com/user/RedHatVideos

facebook.com/redhatinc

twitter.com/RedHat

Questions?

23