Computing and Ground Systems in AMS02
AMS TIM @CERN, July 2003
Alexei Klimentov — [email protected]
MIT
Outline
- AMS02 Data Flow
- AMS02 Ground Centers
- Science Operation Center Architecture: choice of HW, cost estimation, implementation plan
- Data Transmission SW
- TReK SW
ISS to Remote AMS Centers Data Flow

[Diagram: buffered data from AMS on the ISS (ACOP, high-rate frame MUX) flows through NASA's ground infrastructure: the White Sands, NM facility, the Payload Data Service System and the Payload Operation & Integration Center at Marshall Space Flight Center, AL. Real-time, "dump" and playback streams reach the AMS GSC (buffering before transmission), the AMS Payload Operations Control Center (commanding, monitoring, online analysis) and the AMS Science Operations Center (event reconstruction, batch & interactive physics analysis, data archiving), with file transfer on to the AMS Regional Centers.]
AMS Ground Centers (Ground Support Computers)
- Located at Marshall Space Flight Center (MSFC), Huntsville, AL
- Receives monitoring and science data from the NASA Payload Operation and Integration Center (POIC)
- Buffers data until retransmission to the AMS Science Operation Center (SOC) and, if necessary, to the AMS Payload Operations and Control Center (POCC)
- Runs unattended 24h/day, 7 days/week
- Must buffer about 600 GB (data for 2 weeks)
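As a rough sanity check, the buffering requirement can be turned into an average data rate; a minimal sketch (only the 600 GB and two-week figures from this slide are used, the rest is plain unit arithmetic):

```python
# Convert the GSC buffering requirement into an average sustained data rate.
buffer_gb = 600.0   # required buffer, GB (from this slide)
days = 14           # retention period, days (from this slide)

gb_per_day = buffer_gb / days                        # ~43 GB/day
avg_mbit_s = gb_per_day * 1e9 * 8 / (86400 * 1e6)    # ~4 Mbit/s sustained

print(f"{gb_per_day:.0f} GB/day  ->  ~{avg_mbit_s:.1f} Mbit/s average rate")
```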
AMS Ground Centers (Payload Operation and Control Center)
- AMS02 "Counting Room"
- Usual source of AMS commands
- Receives H&S, monitoring, science and NASA data in real-time mode
- Monitors the detector state and performance
- Processes about 10% of the data in near-real-time mode to provide fast information to the shift taker
- Video distribution "box"
- Voice loops with NASA
- Computing facilities:
  - primary and backup commanding stations
  - detector and subdetector monitoring stations
  - stations for event display and subdetector status displays
  - Linux servers for online data processing and validation
AMS Ground Centers (Science Operation Center)
- Receives the complete copy of ALL data
- Data reconstruction, calibration, alignment and processing; generates event summary data and does event classification
- Science analysis
- Archives and records ALL raw, reconstructed and H&S data
- Data distribution to AMS Universities and Laboratories
AMS Ground Centers (Regional Centers)
- Analysis facility to support physicists from geographically close AMS Universities and Laboratories
- Monte-Carlo production
- Access to SOC data storage (event visualisation, detector and data production status, samples of data, video distribution)
- Mirroring of AMS DST/ESD
AMS Data Volume (Tbytes)
Data/Year   | 1998 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | Total
Raw         | 0.20 | ---  | 5.5  | 5.0  | 5.0  | 2.5  | 15   | 15   | 15   | 0.5  | ~64
ESD         | 0.30 | ---  | 3.5  | 0.5  | 0.5  | 7.5  | 44   | 44   | 44   | 1.5  | ~146
Tags        | 0.05 | ---  | ---  | ---  | ---  | 0.2  | 0.6  | 0.6  | 0.6  | 0.1  | 2.0
Total       | 0.55 | ---  | 9    | 5.5  | 5.5  | 10.2 | 59.6 | 59.6 | 59.6 | 2.1  | ~212
MC          | 0.11 | 1.7  | 4.0  | 8.0  | 8.0  | 8.0  | 44   | 44   | 44   | 44   | ~206
Grand Total | 0.66 | 1.7  | 13.0 | 13.5 | 13.5 | 18.2 | 104  | 104  | 104  | 46.1 | ~420

(1998 column: STS-91 flight; 2001 onwards: ISS)
Symmetric MultiProcessor Model
[Diagram: data from the experiment flows into a single symmetric multiprocessor backed by tape storage and terabytes of disks.]
Scalable model
[Diagram: the scalable model, with disk & tape storage and terabytes of disks.]
AMS02 Benchmarks
Execution time of the AMS "standard" simulation ("Sim") and reconstruction ("Rec") jobs on different CPUs, normalized to the Intel PII 450 MHz reference (= 1) 1)

1) V.Choutko, A.Klimentov, AMS note 2001-11-01
Brand, CPU, Memory                             | OS / Compiler           | "Sim" | "Rec"
Intel PII dual-CPU 450 MHz, 512 MB RAM         | RH Linux 6.2 / gcc 2.95 | 1     | 1
Intel PIII dual-CPU 933 MHz, 512 MB RAM        | RH Linux 6.2 / gcc 2.95 | 0.54  | 0.54
Compaq, quad α-ev67 600 MHz, 2 GB RAM          | RH Linux 6.2 / gcc 2.95 | 0.58  | 0.59
AMD Athlon, 1.2 GHz, 256 MB RAM                | RH Linux 6.2 / gcc 2.95 | 0.39  | 0.34
Intel Pentium IV 1.5 GHz, 256 MB RAM           | RH Linux 6.2 / gcc 2.95 | 0.44  | 0.58
Compaq dual-CPU PIV Xeon 1.7 GHz, 2 GB RAM     | RH Linux 6.2 / gcc 2.95 | 0.32  | 0.39
Compaq dual α-ev68 866 MHz, 2 GB RAM           | Tru64 Unix / cxx 6.2    | 0.23  | 0.25
Elonex Intel dual-CPU PIV Xeon 2 GHz, 1 GB RAM | RH Linux 7.2 / gcc 2.95 | 0.29  | 0.35
AMD Athlon 1800MP, dual-CPU 1.53 GHz, 1 GB RAM | RH Linux 7.2 / gcc 2.95 | 0.24  | 0.23
8-CPU SUN-Fire-880, 750 MHz, 8 GB RAM          | Solaris 5.8 / C++ 5.2   | 0.52  | 0.45
24-CPU Sun UltraSPARC-III+, 900 MHz, 96 GB RAM | RH Linux 6.2 / gcc 2.95 | 0.43  | 0.39
Compaq α-ev68 dual 866 MHz, 2 GB RAM           | RH Linux 7.1 / gcc 2.95 | 0.22  | 0.23
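Since the table entries are execution times relative to the Intel PII 450 MHz reference, a lower number means a faster machine. A small sketch showing how they translate into speedup factors (a few "Rec" rows copied from the table, purely for illustration):

```python
# Relative "Rec" execution times (Intel PII 450 MHz = 1.0), taken from the table above.
rec_time = {
    "Intel PII dual 450 MHz":         1.00,
    "Intel PIII dual 933 MHz":        0.54,
    "AMD Athlon MP dual 1.53 GHz":    0.23,
    "Compaq alpha-ev68 dual 866 MHz": 0.23,
}

reference = rec_time["Intel PII dual 450 MHz"]
for cpu, t in rec_time.items():
    print(f"{cpu:32s} speedup x{reference / t:.1f}")
```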
AMS SOC (Data Production requirements)
A complex system consisting of computing components (I/O nodes, worker nodes, data storage and networking switches) that should perform as a single system.

Requirements:
- Reliability – high (24h/day, 7 days/week)
- Performance goal – process data "quasi-online" (with a typical delay < 1 day)
- Disk space – 12 months of data "online"
- Minimal human intervention (automatic data handling, job control and book-keeping)
- System stability – months
- Scalability
- Price/performance
AMS Science Center Computing Facilities
[Diagram: SOC computing facilities connected through the CERN/AMS network: AMS Physics Services (home directories & registry, consoles & monitors, 5 PC servers), Central Data Services (shared disk servers: 25 TeraByte of disk on 6 PC-based servers; shared tape servers: tape robots and LTO/DLT tape drives), Production Facilities (40-50 dual-CPU Linux computers, Intel and AMD, for batch data processing), an Engineering Cluster (5 dual-processor PCs), Data Servers and Analysis Facilities (a Linux cluster of 10-20 dual-processor PCs for interactive and batch physics analysis), and links to the AMS Regional Centers.]
AMS Computing facilities (disks and CPUs: projected characteristics)

Intel/AMD PC
  1998: dual-CPU Intel PII rated at 450 MHz, 512 MB RAM; 7.5 kUS$
  2002: dual-CPU Intel rated at 2.2 GHz, 1 GB RAM and RAID controller; 7 kUS$
  2006: dual-CPU rated at 8 GHz, 2 GB RAM and RAID controller; 7 kUS$

Magnetic disk
  1998: 18 GByte SCSI; 80 US$/GByte
  2002: SG 180 GByte SCSI (10 US$/GByte) and WD 200 GByte IDE (2 US$/GByte)
  2006: 700 GByte; 1 US$/GByte

Magnetic tape
  1998: DLT, 40 GB; 3 US$/GByte
  2002: SDLT and LTO, 200 GB; 0.8 US$/GByte
  2006: ?, 400 GB; 0.3 US$/GByte
AMS02 Computing Facilities Y2000-2005 (cost estimate)
Function                       | Computer                                             | Qty  | Disks (TBytes)      | Cost (kUS$)
GSC@MSFC                       | HP, Sun, Intel, dual-CPU, 1.5+ GHz                   | 2    | 2x1 TB Raid-Array   | 55
POCC x2                        | Intel and AMD, dual-CPU, 2.4+ GHz                    | 20   | 1 TB Raid-Array     | 150
Production Farm                | Intel and AMD, dual-CPU, 2.4+ GHz                    | 50   | 10 TB Raid-Array    | 350
Database Servers               | dual-CPU 2.0+ GHz Intel or Sun SMP                   | 2    | 0.5 TB              | 50
Event Storage and Archiving    | disk servers, dual-CPU Intel 2.0+ GHz                | 6    | 25 TByte Raid-Array | 200
Interactive and Batch Analysis | SMP computer, 4 GB RAM, 300 SpecInt95, or Linux farm | 2/10 | 1 TByte Raid-Array  | 55
Sub-total                      |                                                      |      |                     | 860
Running cost                   |                                                      |      |                     | 150
Grand total                    |                                                      |      |                     | 1010
AMS Computing facilities (implementation plan)
Q1-2 2002: AMS GSC prototype installation @MSFC, AL; data transmission tests between MSFC and CERN, and between MSFC and MIT
Q1-4 2002: disk server and processor architecture evaluation
Q1-2 2003: choice of server and processing-node architecture; set up a 10% prototype of the AMS production farm; evaluation of the archiving system
End 2003: 40% prototype of the AMS production farm
Q1-2 2004: evaluation of SMP vs distributed computing; finalize the architecture of the GSC@MSFC
End 2004: 60% prototype of the AMS production farm; purchase and set up the final GSC@MSFC configuration; choose the "analysis" computer, the archiving and the storage system
Beg 2005: purchase disks to set up the disk pool; purchase the POCC computers
Mid 2005: purchase the "analysis" computer; set up the production farm in its final configuration
End 2005: final configuration of the production farm and the analysis computer in place
CERN’s Network Connections
[Diagram: CERN's external network connections to RENATER, IN2P3, SWITCH, C-IXP, WHO and KPNQwest (US), with link speeds ranging from 2 Mb/s to 1 Gb/s (39/155 Mb/s, 45 Mb/s, 155 Mb/s, 2x255 Mb/s), classified as national research networks, mission-oriented, public and commercial links. TEN-155: Trans-European Network at 155 Mb/s.]
CERN’s Network Traffic
[Diagram: CERN network traffic, about 40 Mb/s outgoing and 38 Mb/s incoming, broken down per link (KPNQwest (US), RENATER, TEN-155, IN2P3, SWITCH) with the link bandwidth and the incoming and outgoing data rates on each link.]

CERN: ~36 TB/month in/out. AMS raw data: 0.66 TB/month = 2 Mb/s (1 Mb/s = 11 GB/day).
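The quoted rates follow from straightforward unit conversion; a small sketch reproducing them (only the 0.66 TB/month figure from this slide is used):

```python
# Reproduce the two conversions quoted above.
seconds_per_day = 86400
seconds_per_month = 30 * seconds_per_day

# 1 Mbit/s sustained over one day:
gb_per_day = 1e6 * seconds_per_day / 8 / 1e9        # ~10.8 GB/day, i.e. ~11 GB/day

# 0.66 TB of AMS raw data per month, expressed as an average rate:
raw_mbit_s = 0.66e12 * 8 / seconds_per_month / 1e6  # ~2 Mbit/s

print(f"1 Mbit/s ~ {gb_per_day:.1f} GB/day")
print(f"0.66 TB/month ~ {raw_mbit_s:.1f} Mbit/s")
```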
Data Transmission
- Will AMS need a dedicated line to send data from MSFC to the ground centers, or can the public Internet be used?
- What software (SW) should be used for bulk data transfer, and how reliable is it?
- What data transfer performance can be achieved?

High-rate data transfer between MSFC, AL and the POCC/SOC, between the POCC and the SOC, and between the SOC and the Regional Centers will become of paramount importance.
Data Transmission SW

Why not FileTransferProtocol (ftp), ncftp, etc.?
- to speed up data transfer
- to encrypt sensitive data and not encrypt bulk data
- to run in batch mode with automatic retry in case of failure
- …

We started to look around and came up with bbftp in September 2001 (we are still looking for good network monitoring tools). bbftp was developed for BaBar and is used to transmit data from SLAC to IN2P3@Lyon; we adapted it for AMS and wrote service and control programs 1).

1) A.Elin, A.Klimentov, AMS note 2001-11-02
2) P.Fisher, A.Klimentov, AMS Note 2001-05-02
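The batch-mode operation with automatic retry can be pictured as a thin wrapper around the bbftp client. Below is a minimal sketch, not the actual AMS service and control programs: the host name, path, user and retry policy are illustrative placeholders, and the bbftp options should be checked against the installed version.

```python
#!/usr/bin/env python
"""Toy wrapper: transfer one file with bbftp, retrying on failure."""
import subprocess, sys, time

def transfer(local_file, remote_host, remote_dir, user, retries=5, wait=60):
    """Run bbftp in batch mode; retry up to `retries` times before giving up."""
    # -e passes the control command(s), -u the remote user name.
    cmd = ["bbftp", "-u", user, "-e", f"put {local_file} {remote_dir}/", remote_host]
    for attempt in range(1, retries + 1):
        if subprocess.call(cmd) == 0:
            return True                                   # transfer succeeded
        print(f"attempt {attempt} failed, retrying in {wait}s", file=sys.stderr)
        time.sleep(wait)
    return False                                          # all attempts failed

if __name__ == "__main__":
    ok = transfer("run0001.dat", "ams.cern.ch", "/data/incoming", user="amsdaq")
    sys.exit(0 if ok else 1)
```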
Data Transmission SW (tests)
Source   | Destination | Test duration (hours) | Nominal bandwidth (Mbit/sec) | Iperf (Mbit/sec) | bbftp (Mbit/sec)
CERN I   | CERN II     | 24                    | 10                           | 10               | 7.8
CERN I   | CERN II     | 24                    | 100                          | 100              | 66.4
CERN II  | MIT         | 12x3                  | 100 [255] 1000               | 26               | 24.6
CERN II  | MSFC, AL    | 24x2                  | 100 [255] 100                | 16               | 9.5
MSFC, AL | CERN II     | 24x2                  | 100 [255] 100                | 16               | 9.5
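To put the bbftp figures in context, a small sketch comparing them with the iperf measurement of the same link (all values copied from the table above):

```python
# bbftp throughput as a fraction of the iperf-measured capacity on the same link.
tests = [
    ("CERN I -> CERN II (10 Mb/s)",   10.0,  7.8),
    ("CERN I -> CERN II (100 Mb/s)", 100.0, 66.4),
    ("CERN II -> MIT",                26.0, 24.6),
    ("CERN II -> MSFC, AL",           16.0,  9.5),
    ("MSFC, AL -> CERN II",           16.0,  9.5),
]
for link, iperf, bbftp in tests:
    print(f"{link:30s} bbftp/iperf = {bbftp / iperf:.0%}")
```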
Data Transmission Tests (conclusions)
- In its current configuration the Internet provides sufficient bandwidth to transmit AMS data from MSFC, AL to the AMS ground centers at rates approaching 9.5 Mbit/sec
- bbftp is able to transfer and store data on a high-end PC reliably, with no data loss
- bbftp performance is comparable to what is achieved with network monitoring tools
- bbftp can be used to transmit data simultaneously to multiple sites