Transcript
Page 1: Farm Completion Beat Jost and Niko Neufeld LHCb Week St. Petersburg June 2010

Farm Completion

Beat Jost and Niko Neufeld

LHCb Week St. Petersburg

June 2010

Page 2:

Filling the farm

• Thanks for interesting and useful discussions to
– Loic Barda, Rolf Lindner, Laurent Roy and Eric Thomas

• Thanks for measurements and plots to
– Juan Caicedo and Patrick Robbe

Farm Completion St. Petersburg 06/2010 - Niko Neufeld 2

Page 3:

The three limits: Power, Cooling, Money

• Power: 550 kW available (105 kW used)

• Cooling: 525 kW nominally available

• Rack-space: 1700 Us (plenty)

• Money: xx MCHF
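As a rough illustration of how the power and cooling limits bound any extension of the farm, the headroom can be worked out from the figures above. This is only a sketch: it assumes (not stated on the slide) that every kW drawn by the servers must also be removed by the cooling system, and all variable names are illustrative.

```python
# Figures quoted on the slide, in kW.
POWER_AVAILABLE_KW = 550
POWER_USED_KW = 105
COOLING_AVAILABLE_KW = 525

# Assumption: all electrical power drawn ends up as heat that must be cooled,
# so the current load counts against both limits.
power_headroom_kw = POWER_AVAILABLE_KW - POWER_USED_KW
cooling_headroom_kw = COOLING_AVAILABLE_KW - POWER_USED_KW

# The tighter of the two limits bounds what can still be added.
usable_headroom_kw = min(power_headroom_kw, cooling_headroom_kw)

print(f"Power headroom:   {power_headroom_kw} kW")
print(f"Cooling headroom: {cooling_headroom_kw} kW")
print(f"Usable headroom:  {usable_headroom_kw} kW")
```

Under that assumption cooling, not power, would be the tighter of the two technical limits.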


Page 4:

Event Filter Farm

• Level 1:
– 100 SuperMicro Twin servers (2 servers in a single 1U chassis with shared power supply), Intel Harpertown CPU 5420 (2.5 GHz), 4 cores / socket, 1 GB RAM / core

• Level 2:
– 350 DELL blade servers (up to 16 blades in a 10U chassis), Intel Harpertown CPU 5420 (2.5 GHz), 4 cores / socket, 2 GB RAM / core


Page 5:

The new farm-node

• Both Intel and AMD have brought out new processors with up to 12 cores / chip and (Intel) hyper-threads (a.k.a. virtual CPUs)

• Memory has (again) become faster and cheaper (DDR-3), and each processor has 3 memory channels (“good” memory configuration = 3 * n, where n = 2, 4, 8, 16)

• Both processors are now NUMA (non-uniform memory access)
– Study program ongoing to take advantage of this
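The "3 * n" rule for memory configurations can be illustrated with a short sketch. Two things here are assumptions rather than statements from the slide: that n is the size of one memory module in GB, and that a server has two processors. The function name is hypothetical.

```python
CHANNELS_PER_PROCESSOR = 3      # from the slide
MODULE_SIZES = (2, 4, 8, 16)    # values of n from the slide; GB is an assumption
SOCKETS = 2                     # assumption: dual-processor server


def good_memory_configs(sockets=SOCKETS):
    """Return the total memory sizes that fill every channel with equal modules."""
    return [sockets * CHANNELS_PER_PROCESSOR * n for n in MODULE_SIZES]


print("Per processor:", good_memory_configs(sockets=1))
print("Per dual-socket server:", good_memory_configs())
```

Under these assumptions the dual-socket values range from 12 to 96, which is consistent with the "24 GB (up to 96)" quoted for the candidate server later in the talk.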


Page 6:

How many jobs / server


Page 7:

How fast?


Page 8:

Server specifications

• 1 GB RAM per hardware thread (= virtual core)

• 1 power-supply failure should not affect more than 2 units

• 2 Gigabit Ethernet ports

• No constraints on power consumption

• CPU (AMD 61xx / Intel 56xx) chosen so as to optimise Moore/CHF


Page 9:

A likely candidate

• 1.2 kW
– redundant PS

• 4 servers, each with
– 12 cores
– 24 GB (up to 96) RAM
– 1 HDD
– 2 x Gigabit Ethernet

• 21 kCHF list-price
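The chassis figures above translate into per-server numbers with simple arithmetic. The sketch below also estimates how many such chassis the power limit quoted earlier in the talk (550 kW available, 105 kW used) could accommodate. It treats the 1.2 kW power-supply rating as a worst-case draw, which is an assumption; real consumption is not given on the slide, and the budget ("xx MCHF") is left out because it is not stated.

```python
# Candidate chassis figures from the slide (integer W and CHF to avoid rounding).
CHASSIS_POWER_W = 1200          # power-supply rating, taken as worst-case draw (assumption)
SERVERS_PER_CHASSIS = 4
CORES_PER_SERVER = 12
CHASSIS_LIST_PRICE_CHF = 21000

# Power figures from the "three limits" slide.
POWER_AVAILABLE_W = 550_000
POWER_USED_W = 105_000

power_per_server_w = CHASSIS_POWER_W // SERVERS_PER_CHASSIS
price_per_server_chf = CHASSIS_LIST_PRICE_CHF // SERVERS_PER_CHASSIS
cores_per_chassis = SERVERS_PER_CHASSIS * CORES_PER_SERVER

# Upper bound on chassis count from the power limit alone.
max_chassis_by_power = (POWER_AVAILABLE_W - POWER_USED_W) // CHASSIS_POWER_W

print(f"{power_per_server_w} W and {price_per_server_chf} CHF (list) per server")
print(f"{cores_per_chassis} cores per chassis")
print(f"At most {max_chassis_by_power} chassis within the power limit")
```

This is an upper bound only; cooling, money and negotiated prices would all change the actual number.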


Page 10:

Conclusions

• We will run with 16 Moore jobs / server (twice as many as today)

• Each server will be 2 to 2.5 x faster than the current HLT node

• Each Moore instance can use up to 1.5 GB RAM
– If more RAM is really needed:
1. Reduce number of jobs
2. Increase (double) memory
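The 1.5 GB figure follows from the 24 GB of the candidate server shared among 16 jobs, and the two fallback options can be checked the same way. The sketch below is illustrative only: the function names are hypothetical, and the 2 GB per-job requirement in the last line is an invented example value, not a measurement.

```python
SERVER_RAM_GB = 24       # candidate server, from the talk
JOBS_PER_SERVER = 16     # from the conclusions


def ram_per_job(ram_gb, jobs):
    """RAM available to each job if memory is shared evenly."""
    return ram_gb / jobs


def max_jobs(ram_gb, needed_gb_per_job):
    """Largest number of jobs that fits if each needs `needed_gb_per_job`."""
    return int(ram_gb // needed_gb_per_job)


print(ram_per_job(SERVER_RAM_GB, JOBS_PER_SERVER))       # baseline
print(ram_per_job(2 * SERVER_RAM_GB, JOBS_PER_SERVER))   # option 2: double the memory
print(max_jobs(SERVER_RAM_GB, 2.0))                      # option 1: hypothetical 2 GB per job
```

Doubling the memory keeps all 16 jobs, while reducing the job count trades throughput for memory at no extra cost.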


Page 11:

Procedure / planning

Step                                                                          Duration

Decision to buy (day X)                                                       0
Technical specifications to firms                                             1 week
Firms reply (with offer) / validation of sample server                        4 weeks
Adjudication (negotiation)                                                    1 week
Delivery (in batches if possible; installation starts as soon as delivered)   6 weeks
Finishing installation                                                        1 week

Farm Level 3 in production                                                    13 weeks after initial decision
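The 13-week total is the sum of the step durations, assuming the steps run strictly one after another. As a sketch, the planning can be turned into calendar dates for any chosen day X. The date used below is purely a placeholder, not the actual decision date, and the function name is hypothetical.

```python
from datetime import date, timedelta

# (step, duration in weeks), as listed in the planning table.
STEPS = [
    ("Technical specifications to firms", 1),
    ("Firms reply (with offer) / validation of sample server", 4),
    ("Adjudication (negotiation)", 1),
    ("Delivery", 6),
    ("Finishing installation", 1),
]

total_weeks = sum(weeks for _, weeks in STEPS)


def schedule(day_x):
    """Return (step, weeks elapsed, completion date), assuming sequential steps."""
    elapsed = 0
    rows = []
    for step, weeks in STEPS:
        elapsed += weeks
        rows.append((step, elapsed, day_x + timedelta(weeks=elapsed)))
    return rows


# Placeholder day X, chosen only to show the calculation.
for step, elapsed, done in schedule(date(2010, 7, 1)):
    print(f"week {elapsed:2d}  {done}  {step}")
```

Because delivery may arrive in batches with installation starting immediately, the real schedule could overlap the last two steps; the sketch ignores that.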


Page 12:

To-do list

Hardware

• Unpacking (surface SX8: need a lot of space and friendly volunteers)

• Installation in D1
– Power, network

• Burn-in (3 days)

• Exchange faulty servers / parts

Software

• Install OS, verify OS tuning (NIC, memory arrangement etc.)

• Integrate in software-management (Quattor)

• Add to farm-control


Page 13:

DETAILS


Page 14:


Page 15:

How fast? (Moore v9r2 HLT1 only)


