Performance assessment of distributed SAN systems

Bartosz Belter, Artur Binczewski, Wojbor Bogacki, Maciej Brzeźniak

TERENA Networking Conference, Poznań, 2005

[email protected] [email protected] [email protected] [email protected]


Agenda

Introduction

Storage Networking challenges

IP Storage – new approach to build distributed SANs

IP Storage – experiments in Polish NREN PIONIER


Storage Networking

A Storage Area Network (SAN) is a high-speed, special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with their associated data servers. SANs are usually based on Fibre Channel or SCSI technology.

Storage Networking definition from SNIA

The practice of creating, installing, administering, or using networks whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements.


Storage Networking – the importance

Currently focused on the application aspect:

Local and remote mirroring, backups and disaster recovery

Remote data replication

Local and remote storage access

Explosion of storage data:

Data warehousing: statistics, charts, reporting

Internet: web hosting, e-commerce, e-business

Customer Relationship Management


Are separate SANs enough for high performance computing?

How can remote, separate HPC centers be integrated into a single, distributed, scalable high-performance system?

HPC centers use different technologies, which are not always applicable in a backbone network

Traditional Storage Networking introduces an additional limitation: the maximum distance over which data can be transferred

Traditional Storage Networking technology: SCSI vs. FC

Maximum cable length: SCSI 25 meters if no more than 2 devices are used, otherwise 12 meters; FC 30 meters device to device (copper), 10 000 meters device to device (optical)

Maximum speed: SCSI 2.560 Gbps; FC up to 2.125 Gbps (10 Gbps in the near future)

Maximum number of devices: SCSI 16; FC 126


IP Storage

IP Storage is a new approach to extending existing Storage Area Networks using the IP protocol, usually over Gigabit Ethernet.

According to SNIA, IP Storage is:

Computer systems and storage elements that are connected via Internet Protocol (IP)

The transport of storage traffic over an IP network

IP Storage traffic carries the traditional block I/O using SCSI protocols supported by most open systems

According to SNIA, IP Storage is not:

File-level transfer of data (e.g. NAS)

Object-level access (e.g. HTTP, FTP)


IP Storage protocols

Internet Small Computer Systems Interface (iSCSI)

iSCSI is a protocol that transfers block data over an IP network instead of a directly attached SCSI bus. It runs on top of TCP and, unlike other network storage protocols, requires only an Ethernet interface to operate.

Internet Fibre Channel Protocol (iFCP)

iFCP is a new standard for extending Fibre Channel storage networks across the Internet. It provides a mechanism to deliver storage data to and from Fibre Channel storage devices over a SAN infrastructure, or even over the Internet, using TCP/IP.

Fibre Channel Over IP (FCIP)

FCIP describes mechanisms that allow the interconnection of islands of Fibre Channel storage area networks over IP-based networks to form a unified storage area network in a single Fibre Channel fabric. FCIP relies on IP-based network services to provide the connectivity between the storage area network islands over local area networks, metropolitan area networks, or wide area networks.


The experiment

Tests were performed in the Polish Optical Internet PIONIER

the testbed interconnects 9 HPC centers

maximum distance: over 1500 km

no QoS was provided for FCIP traffic across the WAN infrastructure; FCIP was tested on the production network

IP Storage vendor solutions used in tests:

CNT UltraNet Edge 3000

Cisco MDS 9216 and 8-port IP Storage Services Module

The main goals of the experiment:

to build a distributed data architecture based on the new IP Storage technology

to verify the IP Storage protocols (iSCSI and FCIP) in a live network environment

to evaluate the performance of IP Storage vendor solutions connected via Gigabit Ethernet


Testbed description


Hardware: PC

Processor: Pentium 4 3.0 GHz

Memory: 512 MB

Hard disc: Seagate Barracuda 7200.7 SATA, Western Digital Raptor WD740GD

Gigabit Ethernet controller

Fibre Channel interface: QLA 2340

IP Storage element: Cisco MDS 9216, CNT UltraNet Edge 3000

RAID 0 includes two storage arrays

[Diagram: PC, IP Storage element, Gigabit Ethernet switch, Gigabit Ethernet, Gigabit Ethernet switch, IP Storage element, RAID 0]


Testing methodology

Benchmark software:

Windows 2000: HD Tach, SiSoftware Sandra

Linux SUSE 9.1 and 9.2: Bonnie, IOzone, hdparm, IOMeter, MySQL database benchmark, Performance Benchmark from Tivoli SANergy


Test results: Performance Benchmark from Tivoli SANergy

Reading performance

[Chart: reading throughput (MB/s, axis 10 to 100) for FCIP and iSCSI at each test site: Poznań 0 km, Zielona Góra 161 km, Wrocław 390 km, Opole 500 km, Katowice 650 km, Bielsko Biała 740 km, Kraków 850 km, Radom 1090 km, Białystok 1540 km]

as expected, overall performance decreases, and it has a linear relationship with the distance

interconnection of distant HPC centers is possible even over 1500 km (but overall performance is halved)


Test results: Performance Benchmark from Tivoli SANergy

Reading/Writing performance (Write Acceleration option)

[Charts: reading throughput and writing throughput (MB/s, axis 10 to 100), series FCIP and Write Acceleration, at test sites Poznań 0 km, Wrocław 390 km, Bielsko Biała 740 km and Białystok 1540 km]

some vendors introduce their own improvements to the protocols: Cisco implements a "Write Acceleration" (WA) feature

WA did not affect reading performance

WA gives interesting results for writing performance: in Białystok (1540 km) writing performance doubles compared with standard FCIP transmission
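The doubling of write performance at long distance can be illustrated with a small latency model. The slides do not describe how Write Acceleration works; the sketch below assumes the commonly described mechanism, in which a standard SCSI write over the WAN costs two round trips (command and transfer-ready, then data and status), and the accelerating device answers the transfer-ready locally, leaving one. The function names, the 64 KB write size and the single outstanding command are illustrative assumptions, not values from the tests.

```python
def write_time_s(size_bytes, rtt_s, bandwidth_bps, round_trips):
    """Time for one write: protocol round trips plus serialization time."""
    return round_trips * rtt_s + size_bytes * 8 / bandwidth_bps


def write_throughput_mb_s(size_bytes, rtt_s, bandwidth_bps, accelerated):
    """Throughput with one outstanding write at a time (a simplification)."""
    # Assumption: 2 WAN round trips per write normally, 1 with acceleration.
    round_trips = 1 if accelerated else 2
    seconds = write_time_s(size_bytes, rtt_s, bandwidth_bps, round_trips)
    return size_bytes / seconds / 1e6


# Hypothetical inputs: 64 KB writes, 20 ms round trip, Gigabit Ethernet.
standard = write_throughput_mb_s(64 * 1024, 0.020, 1e9, accelerated=False)
with_wa = write_throughput_mb_s(64 * 1024, 0.020, 1e9, accelerated=True)
ratio = with_wa / standard
```

When latency dominates the serialization time, the ratio in this model approaches 2, which is consistent with the doubling reported for Białystok; at short distances the model predicts a much smaller gain.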


Test results: IOMeter, reading, CPU load

[Chart: CPU load (%CPU, axis 0 to 100) for FCIP and iSCSI at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]

the iSCSI software driver introduces a higher CPU load than FCIP, which is handled in hardware


Test results: copying of 700 MB raw data

[Chart: copy time (sec, axis 2.5 to 25) for FCIP and iSCSI at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]

copy time shows a good linear relationship with the distance


Test results – MySQL benchmark

MySQL: popular Open Source relational database

benchw: simple benchmark for relational databases (http://benchw.sourceforge.net)

DB tables:

fact01: 1.02 GB, 10 million records

dim0: 0.24 MB, 10k records

dim1: 0.24 MB, 10k records

dim2: 1.40 MB, 10k records

Query types:

Loading data into the database: all tables

Q0: select from 2 tables with 2 conditions (dim0 & fact01, “=”, “<>”, numbers)

Generating indexes for the tables: all tables

DB & DB filesystem recreated each time
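To make the shape of Q0 concrete, here is a minimal sketch of a two-table select with one "=" and one "<>" condition on numeric columns. It uses Python's built-in sqlite3 rather than MySQL so that it runs without a server. Only the table names dim0 and fact01 come from the slide; the column names, the sample rows, and the choice of which condition is the join are hypothetical and are not taken from benchw.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the table names appear in the slides.
conn.executescript(
    """
    CREATE TABLE dim0 (d0key INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact01 (d0key INTEGER, value INTEGER);
    """
)
conn.executemany("INSERT INTO dim0 VALUES (?, ?)", [(1, 2004), (2, 2005)])
conn.executemany(
    "INSERT INTO fact01 VALUES (?, ?)", [(1, 10), (1, 0), (2, 7), (2, 5)]
)

# Two tables, two numeric conditions: one "=" and one "<>".
Q0_SHAPE = """
SELECT COUNT(*)
FROM dim0 AS d, fact01 AS f
WHERE d.d0key = f.d0key
  AND f.value <> 0
"""
(count,) = conn.execute(Q0_SHAPE).fetchone()
```

A query of this shape has to scan the large fact table, so on a remote volume its run time depends mainly on sequential read throughput.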


Test results: MySQL database benchmark, loading data to database server

[Chart: time (sec, axis 0 to 100) for FCIP and iSCSI at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]

loading into the database reads the input file sequentially and puts the data into the database structure

operation time scales linearly with the distance


Test results: MySQL database benchmark, query no. 0

[Chart: time (sec, axis 0 to 60) for FCIP and iSCSI at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]

the operation reads from only two database tables

even an uncomplicated query shows a decrease in performance between local and remote measurements


Test results: MySQL database benchmark, index generating

[Chart: time (sec, axis 0 to 600) for FCIP and iSCSI at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]

the operation reads from all tables stored in the database and writes a small amount of data (the generated indexes)

a more complicated request shows a significant decrease in performance between local and remote measurements


Test results: dd command, FCIP vs. iSCSI

[Chart: time (sec, axis 5 to 55) for FCIP and iSCSI with block sizes 4096, 16384, 32768 and 131072 at the nine test sites, from Poznań (0 km) to Białystok (1540 km)]


with block sizes of 4 kB, 16 kB or 32 kB there are no significant differences between the iSCSI and FCIP protocols

the greater the block size, the better the performance, but ...

too large a block size decreases overall performance (block size > RAID chunk size)
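A block-size sweep of this kind can be approximated without dd. The sketch below times one full sequential read of a file per block size, using the four block sizes from the chart. The slides do not say whether the dd runs read or wrote data, so the read direction is an assumption, and the names `timed_read` and `sweep` are illustrative.

```python
import os
import tempfile
import time

BLOCK_SIZES = (4096, 16384, 32768, 131072)  # bytes, as on the chart


def timed_read(path, block_size):
    """Read `path` sequentially in block_size chunks; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 so each read() asks the OS for block_size bytes directly
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def sweep(path, block_sizes=BLOCK_SIZES):
    """Time one full sequential read per block size."""
    return {bs: timed_read(path, bs) for bs in block_sizes}


if __name__ == "__main__":
    # Hypothetical stand-in: a local temporary file instead of a remote volume.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
    try:
        for bs, (n, secs) in sweep(tmp.name).items():
            print(f"bs={bs:>6}  {n} bytes in {secs:.4f} s")
    finally:
        os.remove(tmp.name)
```

On a local file the operating system's page cache will hide most of the differences; the effect described above only shows up when each read actually travels to the remote storage.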


IP Storage – tuning up the transmission

configured TCP parameters:

TCP Maximum Window Size (default: 64 Kbytes, maximum: 32 Mbytes)

MWS > B x D, where B is the end-to-end bandwidth and D is the round trip time

example: Gigabit Ethernet network, RTT = 10 ms

MWS > 1000 x 10^6 bit/sec x 10 x 10^-3 sec = 10^7 bit

MWS > ~1.2 Mbytes

TCP Selective Acknowledgment (SACK)

TCP SACK helps TCP connections that are extended over long distances to recover from frame loss

MTU set to 2148 bytes on IP Storage devices

for the iSCSI protocol: hardware TCP Offload Engine was not tested

for the FCIP protocol: FCIP Compression was not tested
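The window-size rule above is the bandwidth-delay product. A short sketch of the arithmetic, using the Gigabit Ethernet and 10 ms figures from the example; the function names are illustrative. The second function shows the reverse view: the throughput ceiling that a given window imposes on a single TCP connection, evaluated here for the 64 Kbyte default.

```python
def min_tcp_window_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes: the window must exceed B x D."""
    return bandwidth_bps * rtt_s / 8


def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound for one TCP connection: at most one window per round trip."""
    return window_bytes * 8 / rtt_s


GIGABIT_BPS = 1000e6   # B, from the example
RTT_S = 10e-3          # D, from the example

bdp_bytes = min_tcp_window_bytes(GIGABIT_BPS, RTT_S)      # about 1.25e6 bytes
ceiling_bps = max_throughput_bps(64 * 1024, RTT_S)        # about 52.4e6 bit/s
```

With the default 64 Kbyte window and a 10 ms round trip, a single connection cannot exceed roughly 52 Mbit/s, far below the Gigabit Ethernet line rate, which is why the window had to be raised for the long-distance tests.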


IP Storage – conclusions

As expected, overall performance decreases, and it has a linear relationship with the distance (latency)

Assuming a linear characteristic, it is possible and easy to predict how overall performance decreases as the distance (latency) increases:

for each 100 km of distance, performance decreases by about 4 MB/s

for every 1 ms of latency, performance decreases by about 3 MB/s

Interconnection of distant HPC centers is possible even over 1500 km (but overall performance is halved)

The Write Acceleration feature considerably increases writing performance

The iSCSI software driver used in the tests could noticeably affect iSCSI performance, especially over short distances

Interoperability

Even though the IP Storage protocols are published by the IETF, interoperability is still an important issue
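The linear rule of thumb from the conclusions can be written as a small predictor. The two slopes (about 4 MB/s per 100 km and about 3 MB/s per 1 ms) are the figures quoted above; the baseline throughput is not given in the text, so the 80 MB/s used below is a purely hypothetical input, as are the function names.

```python
DROP_PER_100_KM_MB_S = 4.0  # from the conclusions
DROP_PER_MS_MB_S = 3.0      # from the conclusions


def predict_by_distance(baseline_mb_s, distance_km):
    """Predicted throughput (MB/s) at a given distance, floored at zero."""
    return max(0.0, baseline_mb_s - DROP_PER_100_KM_MB_S * distance_km / 100.0)


def predict_by_latency(baseline_mb_s, latency_ms):
    """Predicted throughput (MB/s) at a given latency, floored at zero."""
    return max(0.0, baseline_mb_s - DROP_PER_MS_MB_S * latency_ms)


# Hypothetical local baseline of 80 MB/s.
at_500_km = predict_by_distance(80.0, 500)   # 60.0
at_5_ms = predict_by_latency(80.0, 5)        # 65.0
```

The model is only as good as the linearity assumption; the floor at zero keeps it from returning negative throughput outside the measured range.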


Thank you!