Clusters With Glusterfs

Marian Marinov - mm@yuhu.biz
System Architect - Siteground.com

HighLoad++ 2008 Moscow

Clusters with GlusterFS

Moscow 06-07.Oct.2008

Agenda

Cluster Filesystems

Some facts

Gluster Design

➢ kernel
➢ gluster engine
➢ protocols
➢ translators
➢ storage
➢ performance
➢ others
➢ schedulers

Some benchmarks

Cluster Filesystems

Facts

The GlusterFS project started in August 2006

It is not an actual filesystem; it stores its data on top of existing filesystems

Server runs only on Linux

Client runs on Linux and FreeBSD

Very scalable

Very easy to install and maintain

GlusterFS Design

GFarm Design

Gluster Filesystem Design

In the kernel

➢ Requires FUSE
➢ FUSE as module
➢ GlusterFUSE

The engine

➢ Server & Client
➢ Transport Modules
➢ Translators
➢ Scheduler Modules

GlusterFS Design

GlusterFS Design

The picture explained:

ClientX:

volume serverX - defines a name for a remote server
subvolumes brick0 - selects which of the volumes exported by the remote server we are interested in

some performance translators

volume unify - declares that the unify cluster translator will be used
subvolumes serverX serverY - lists the already connected storage volumes that will be used
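The client-side layout described above can be pictured as a small graph of named volumes, each pointing at its subvolumes. The Python sketch below only illustrates that structure: the `Volume` class and the `flatten` helper are hypothetical and not part of GlusterFS, while the names serverX, serverY, brick0 and unify are taken from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Volume:
    """Hypothetical model of one volume entry in a client spec."""
    name: str
    kind: str  # label used only in this sketch, e.g. "client" or "unify"
    subvolumes: list = field(default_factory=list)

def flatten(volume):
    """Return volume names in dependency order: subvolumes before their users."""
    order = []
    for sub in volume.subvolumes:
        # Plain strings are remote brick names, not local volumes; skip them.
        if isinstance(sub, Volume):
            for name in flatten(sub):
                if name not in order:
                    order.append(name)
    order.append(volume.name)
    return order

# Each client volume names a remote server and the exported brick it wants.
server_x = Volume("serverX", "client", ["brick0"])
server_y = Volume("serverY", "client", ["brick0"])
# The unify volume aggregates the already connected storage volumes.
unify = Volume("unify", "unify", [server_x, server_y])

print(flatten(unify))
```

The ordering matters because a volume can only refer to subvolumes that have already been defined, which is why unify comes last.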

Gluster Filesystem Design

Transport Modules:

For TCP/IP transport
transport-type tcp/server

For InfiniBand SDP transport
transport-type ib-sdp/server

For InfiniBand Verbs transport
transport-type ib-verbs/server

Gluster Filesystem Design

The idea comes from GNU/Hurd

Translators

➢ Performance
➢ Clustering
➢ Scheduling
➢ Storage
➢ Others

Gluster Filesystem Design

Performance translators

➢ Read Ahead
➢ Write Behind
➢ Threaded I/O
➢ IO-Cache
➢ Stat Pre-fetch (still not ported to the new versions)
➢ Booster

Gluster Filesystem Design

Clustering translators

➢ Stripe
➢ Unify
➢ AFR (Automatic File Replication)
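As a rough illustration of how the three clustering translators differ in where data ends up, here is a simplified Python model. It is only a sketch: the function names are hypothetical, the block size is arbitrary, and the real translators handle far more (metadata, namespaces, failures) than this placement logic.

```python
def unify_place(filename, bricks, pick):
    """Unify: the whole file lives on exactly one brick, chosen by a scheduler callback."""
    return {pick(bricks): [filename]}

def afr_place(filename, bricks):
    """AFR: every brick holds a full copy of the file."""
    return {brick: [filename] for brick in bricks}

def stripe_place(data, bricks, block_size):
    """Stripe: the file is cut into blocks that are dealt out to the bricks in turn."""
    layout = {brick: [] for brick in bricks}
    for offset in range(0, len(data), block_size):
        brick = bricks[(offset // block_size) % len(bricks)]
        layout[brick].append(data[offset:offset + block_size])
    return layout

bricks = ["brick1", "brick2"]
print(unify_place("a.txt", bricks, pick=lambda candidates: candidates[0]))
print(afr_place("a.txt", bricks))
print(stripe_place(b"abcdefgh", bricks, block_size=3))
```

In this model unify depends on a scheduler to choose the brick, which is where the scheduling translators come in.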

Gluster Filesystem Design

Scheduling translators

➢ Adaptive Least Usage (ALU)
➢ Non-uniform filesystem architecture (NUFA)
➢ Random
➢ Round-Robin
➢ Switch

Gluster Filesystem Design

Adaptive Least Usage (ALU)

➢ disk-usage
➢ read-usage
➢ write-usage
➢ open-files-usage
➢ disk-speed-usage

Gluster Filesystem Design

Non-uniform filesystem architecture (NUFA)

➢ local-volume-name
➢ limits.min-free-disk

Random

➢ limits.min-free-disk

Round-Robin

➢ limits.min-free-disk
➢ read-only-subvolumes
➢ refresh-interval
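To make the interplay between round-robin selection and a minimum-free-disk limit concrete, the following Python sketch cycles through bricks and skips any that fall below the limit. The class, its parameters and the percentage-based threshold are assumptions made for illustration; this is not the GlusterFS scheduler code.

```python
class RoundRobinScheduler:
    """Hypothetical round-robin picker that skips bricks below a free-space limit."""

    def __init__(self, bricks, min_free_percent):
        self.bricks = list(bricks)
        self.min_free_percent = min_free_percent
        self._next = 0  # index of the brick to try first on the next call

    def pick(self, free_percent):
        """Return the next eligible brick; free_percent maps brick name to free space in %."""
        for _ in range(len(self.bricks)):
            brick = self.bricks[self._next]
            self._next = (self._next + 1) % len(self.bricks)
            if free_percent[brick] >= self.min_free_percent:
                return brick
        raise RuntimeError("no brick satisfies the min-free-disk limit")

scheduler = RoundRobinScheduler(["brick1", "brick2", "brick3"], min_free_percent=5)
free = {"brick1": 40, "brick2": 2, "brick3": 60}
picks = [scheduler.pick(free) for _ in range(4)]
print(picks)
```

Here brick2 is never chosen because it has less free space than the limit, so new files alternate between the other two bricks.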

Gluster Filesystem Design

Switch

➢ switch.case *jpg:brick1,brick2;*mp3:brick3;*:brick4,brick5
➢ switch.read-only-subvolumes brick7
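The switch.case value reads as a list of pattern:bricks rules separated by semicolons. The Python sketch below parses that string and looks up the candidate bricks for a file name. It assumes shell-style glob matching and that the first matching rule wins; both are assumptions for illustration, and the helper names are hypothetical.

```python
from fnmatch import fnmatch

def parse_switch_case(spec):
    """Parse 'pattern:brick,brick;pattern:brick' into an ordered list of rules."""
    rules = []
    for clause in spec.split(";"):
        pattern, bricks = clause.split(":")
        rules.append((pattern.strip(), [b.strip() for b in bricks.split(",")]))
    return rules

def candidate_bricks(filename, rules):
    """Return the bricks of the first rule whose glob pattern matches the file name."""
    for pattern, bricks in rules:
        if fnmatch(filename, pattern):
            return bricks
    return []

rules = parse_switch_case("*jpg:brick1,brick2;*mp3:brick3;*:brick4,brick5")
print(candidate_bricks("photo.jpg", rules))
print(candidate_bricks("song.mp3", rules))
print(candidate_bricks("notes.txt", rules))
```

The trailing `*` rule acts as a catch-all, so files that match none of the earlier patterns still get a destination.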

Gluster Filesystem Design

Other translators

➢ client
➢ server
➢ posix
➢ posix-locks
➢ bdb (very new)
➢ rot-13
➢ trace
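The stacking idea behind translators can be sketched in a few lines of Python: each layer exposes the same read/write calls and passes them on to the layer below. The classes here are hypothetical stand-ins loosely named after the rot-13, trace and storage translators; they are not GlusterFS code and cover only two operations.

```python
import codecs

class MemoryStore:
    """Stand-in for a storage translator: keeps file contents in a dict."""
    def __init__(self):
        self.files = {}
    def write(self, path, text):
        self.files[path] = text
    def read(self, path):
        return self.files[path]

class Rot13:
    """Applies ROT13 on the way down and reverses it on the way up."""
    def __init__(self, child):
        self.child = child
    def write(self, path, text):
        self.child.write(path, codecs.encode(text, "rot_13"))
    def read(self, path):
        return codecs.decode(self.child.read(path), "rot_13")

class Trace:
    """Records every call before passing it to the child."""
    def __init__(self, child):
        self.child = child
        self.log = []
    def write(self, path, text):
        self.log.append(("write", path))
        self.child.write(path, text)
    def read(self, path):
        self.log.append(("read", path))
        return self.child.read(path)

store = MemoryStore()
stack = Trace(Rot13(store))
stack.write("/hello.txt", "Hello")
print(store.files["/hello.txt"])  # contents as held by the bottom layer
print(stack.read("/hello.txt"))
print(stack.log)
```

Because every layer has the same interface, layers can be reordered or removed without touching the others.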

Gluster Filesystem Design

In the future

➢ Live addition/removal of nodes
➢ Automatic File Reordering
➢ Web GUI
➢ mod_glusterfs

Gluster Design

Benchmarks

Aggregated Read Throughput Benchmark

Multiple instances of the dd utility were run simultaneously with different block sizes to read from the GlusterFS filesystem.

            4KB          16KB         128KB        256KB        512KB        1024KB
Lustre      1,796 MB/s   5,782 MB/s   20,423 MB/s  21,582 MB/s  22,789 MB/s  23,731 MB/s
GlusterFS   11,415 MB/s  11,424 MB/s  11,427 MB/s  11,419 MB/s  11,411 MB/s  11,409 MB/s

Aggregated Write Throughput Benchmark

Multiple instances of the dd utility were run simultaneously with different block sizes to write to the GlusterFS filesystem.

            4KB          16KB         128KB        256KB        512KB        1024KB
Lustre      969 MB/s     1,613 MB/s   1,988 MB/s   1,989 MB/s   1,984 MB/s   1,983 MB/s
GlusterFS   1,886 MB/s   2,191 MB/s   2,237 MB/s   2,231 MB/s   2,236 MB/s   2,223 MB/s

Note: Higher means faster.
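One way to read the two tables is to divide the GlusterFS figure by the Lustre figure for each block size. The short Python sketch below does that arithmetic using the numbers exactly as printed on the slides (in MB/s); the variable and function names are illustrative only.

```python
block_sizes = ["4KB", "16KB", "128KB", "256KB", "512KB", "1024KB"]

# Throughput figures in MB/s, copied from the tables above.
lustre_read = [1796, 5782, 20423, 21582, 22789, 23731]
gluster_read = [11415, 11424, 11427, 11419, 11411, 11409]
lustre_write = [969, 1613, 1988, 1989, 1984, 1983]
gluster_write = [1886, 2191, 2237, 2231, 2236, 2223]

def ratios(numerators, denominators):
    """Element-wise ratio, rounded to two decimals."""
    return [round(n / d, 2) for n, d in zip(numerators, denominators)]

read_ratio = ratios(gluster_read, lustre_read)
write_ratio = ratios(gluster_write, lustre_write)

for size, r, w in zip(block_sizes, read_ratio, write_ratio):
    print(f"{size:>7}: read GlusterFS/Lustre = {r}, write GlusterFS/Lustre = {w}")
```

A ratio above 1 means GlusterFS was faster at that block size, and below 1 means Lustre was faster.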

Benchmarks

Apache Web Server Benchmark

Apache served 12039 files (595 MB) over HTTP. A wget client fetched the files recursively.

            Time
Lustre      Failed after downloading 33 MB out of 585 MB in 11 mins
GlusterFS   3 mins 11 secs

Archive Creation

The tar utility created an archive of 12039 files (595 MB) served through GlusterFS.

            Time
Lustre      41 secs
GlusterFS   25 secs

Archive Extraction

            Time
Lustre      FAILED: No space left on device
GlusterFS   43 secs

Note: Lower means faster.

Sources of Information

Project's site: http://www.gluster.org

Official GlusterFS documentation wiki: http://www.gluster.org/docs/index.php/GlusterFS

On IRC: irc.freenode.net #gluster

The mailing list: gluster-devel@nongnu.org

Clusters with GlusterFS

Questions?
