ATOLL: ATOmic Low Latency
A high-performance, low-cost SAN

Patrick R. Haspel
haspel@uni-mannheim.de
Computer Architecture Group
University of Mannheim, Germany

Cluster Computing

• Cluster computing is emerging as a new approach to high-performance computing because of its superior price/performance ratio.

• The key to cluster computing is a SAN that delivers the communication performance normally found in supercomputers.

• Several SANs have been developed in recent years: ServerNet, Memory Channel, QsNet, SCI, ATOLL

ATOLL Basic Architecture

ATOLL-Chip

4.5 million transistors

0.18 µm CMOS process

5.7 x 5.7 mm chip

[Figure: ATOLL Top-Level Block Diagram. Labels: PCI-X bus; PCI-X interface (64 bit/133 MHz); Host Ports 0-3; 4x4 full-duplex Xbar; Links 0-3 (ATOLL-Link)]

Fastest and Second Biggest Design of a European University

Optimization for Performance and Cost

Costs in Percent

Component          %
PCB                1.44
Connector HDRA     2.65
Discrete comp.     1.00
Misc. chips        3.52
ATOLL chip        38.30
Package (BGA)      4.30
Link cable        38.29
Mechanics          1.56
Soldering          2.55
Test               6.38
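As a quick cross-check of the cost breakdown above, the following sketch sums the shares and picks out the two largest items. The values are the percentages from the slide; the dictionary keys and variable names are illustrative only.

```python
# Cost shares in percent, copied from the slide (labels are illustrative).
cost_share = {
    "PCB": 1.44,
    "Connector HDRA": 2.65,
    "Discrete components": 1.00,
    "Misc. chips": 3.52,
    "ATOLL chip": 38.30,
    "Package (BGA)": 4.30,
    "Link cable": 38.29,
    "Mechanics": 1.56,
    "Soldering": 2.55,
    "Test": 6.38,
}

# Sum of all shares; the slide values may not add up to exactly 100.
total = sum(cost_share.values())

# Rank components by share, largest first, and keep the top two.
ranked = sorted(cost_share.items(), key=lambda item: item[1], reverse=True)
top_two = ranked[:2]
top_two_share = sum(share for _, share in top_two)

print(f"total of listed shares: {total:.2f} %")
for name, share in top_two:
    print(f"{name}: {share:.2f} %")
print(f"top two combined: {top_two_share:.2f} %")
```

The listed shares add up to 99.99 %, presumably because of rounding, and the ATOLL chip and the link cable together account for roughly 77 % of the cost.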

ATOLL Performance

[Figure: ATOLL Top-Level Block Diagram, as on the ATOLL Basic Architecture slide]

DMA-Mode Test

Test system: P3-1000 (ServerWorks), PCI 66 MHz/64 bit, ATOLL @ 245 MHz

• 240-byte message
• Latency stages (diagram labels: SW send, SW receive): 4 µs, 3.8 µs, 1.2 µs
• Sum: 9 µs
• 533 MB/s write burst rate
• 137 MB/s read burst rate (bridge problem with stop)
• Not fully optimized yet
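The figures on this slide can be cross-checked with a little arithmetic. The sketch below sums the three latency stages and computes the theoretical peak rate of a 64-bit bus. It assumes the usual nominal clocks of 66.67 MHz for "PCI 66" and 133.33 MHz for PCI-X 133, and counts 1 MB as 10^6 bytes; the function and variable names are illustrative and not part of any ATOLL software.

```python
def peak_bandwidth_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus: one transfer of width_bits per clock."""
    return width_bits / 8 * clock_mhz

# Latency stages from the slide, in microseconds.
stage_latencies_us = [4.0, 3.8, 1.2]
total_latency_us = sum(stage_latencies_us)

# Assumed nominal clocks: 66.67 MHz (test system) and 133.33 MHz (PCI-X interface).
pci_66_peak = peak_bandwidth_mb_s(64, 200 / 3)
pcix_133_peak = peak_bandwidth_mb_s(64, 400 / 3)

# Measured read burst rate as a fraction of the test system's peak.
read_fraction = 137 / pci_66_peak

print(f"total latency: {total_latency_us:.1f} us")
print(f"64-bit PCI at 66 MHz peak: {pci_66_peak:.0f} MB/s")
print(f"64-bit PCI-X at 133 MHz peak: {pcix_133_peak:.0f} MB/s")
print(f"read burst rate relative to peak: {read_fraction:.1%}")
```

Under these assumptions the measured 533 MB/s write burst rate coincides with the nominal peak of the 64-bit, 66 MHz PCI bus in the test system, while the 137 MB/s read burst rate reaches only about a quarter of it.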

ATOLL Performance

A module has been developed in collaboration with the University of Mannheim to evaluate their ATOLL network cards. This experimental hardware delivers the best performance for messages smaller than 10 kB, and matches the 2 Gbps throughput seen with many proprietary solutions like SCI and Myrinet.

ATOLL-Software

[Figure: ATOLL software stack. Labels: User Application; MPI, PVM, TCP/IP; ATOLL API; ATOLL daemon; Kernel Driver; ATOLL HW]

ATOLL daemon:
• Controls network startup (clock distribution, routing)
• Supervises the NIC at runtime
• Provides routing information

Open Source SW

Future Development

Future of ATOLL Hardware-Development

• EXTOLL
• 500 - 1000 MHz clock
• higher dimensional crossbar for multidimensional IN structures
• multithreaded cached host interface
• memory management support
• command extension for direct memory operations (put, get, …) => MPI-2

Chip Photo

Chip Photo

ATOLL Board

Interconnect