"Facility for muon analysis at FNAL"
Hans Wenzel, Fermilab
CMS week, CERN, September 2002
I. What is available at FNAL right now
II. What will be available after the upgrade
III. How to access files in mass storage (dCache, Enstore)
IV. How to use the batch system
V. Near-term plan
Introduction
Computing at the CMS Tier 1 center at FNAL provides:
I. Monte Carlo production (trigger + physics TDR) in a distributed environment.
II. Hosting and serving of the data (mass storage).
III. A computing and development platform for physicists (resources, code, disk, help, tutorials, ...).
IV. Evaluation of new hardware and software solutions.
V. Active development.
Our Web sites
I. Monitoring page, links to tools and scripts http://computing.fnal.gov/cms/Monitor/cms_production.html
II. Department web site: http://computing.fnal.gov/cms
III. The batch system: http://gyoza8.fnal.gov/cgi-bin/fbsng/fbswww/fbswww
IV. The dCache system: http://gyoza7.fnal.gov:443
Obtaining a CMS account at FNAL
I. Go to http://computing.fnal.gov/cms/ and click on the "CMS Account" button, which will guide you through the process:
II. Step 1: Get a valid Fermilab ID.
III. Step 2: Get an fnalu account and a CMS account.
IV. Step 3: Get a Kerberos principal and a CRYPTOCard.
V. Step 4: Send me email to create an account on the CMS cluster, and read "Information for first-time CMS account users".
Help
Mailing lists; archives at http://listserv.fnal.gov/
Web pages: http://www.fnal.gov/cd/
What's available for the user at FNAL
I. We are currently setting up and evaluating the best solution; the current situation is far from ideal. Some annoyance is also caused by the software distribution being based on an outdated Linux version.
II. Linux servers: wonder, burrito, whopper; NFS cross-mounted /data disks (DAS); FBSNG batch system (8 CPUs) attached to whopper. You need to contact me to get your Kerberos principal matched to the special batch principal.
III. cmsun1: an 8-way Sun SMP machine.
[Diagram: the current FNAL CMS facility. User-analysis machines CMSUN1, WHOPPER, WONDER, BURITO, GALLO, VELVEETA, RAMEN, CHALUPA, CHOCOLAT and SNICKERS (roughly 250 GB to 1 TB of disk each, some serving as batch nodes), the POPCRN production cluster, the US-CMS testbed (GYOZA), R&D machines BIGMAC and FRY (1 TB), and Enstore (15 drives), all connected through a Cisco 6509 switch with ESnet (OC3) and MREN (OC3) links.]
[Photos: IBM servers (chocolat, snickers, chimichanga, chalupa), CMSUN1, Dell servers, and the Winchester RAID array.]
[Photos: popcorn nodes (MC production), fry nodes (user), gyoza nodes (test).]
Integrating the desktop
I. Besides central computing, make use of powerful PCs running Linux (plenty of disk, CPU, freedom, ...).
II. We created a CMS desktop workgroup containing everything you need to run CMS software on your PC (AFS, ...). The CMS software is kept up to date in AFS, and you can create your own Objectivity database.
Upgrades this year
I. 40 more farm nodes --> 3x computing power.
II. >20 nodes for user computing. But we will go away from dedicated farms and instead dynamically assign nodes as necessary.
III. 2-5 nodes for web servers, disk-cache master nodes, etc.
IV. 8-12 disk servers to be part of dCache.
V. New fast disk system (Zambeel).
VI. Faster, higher-capacity tape drives (STK 9940B).
VII. Better connectivity between CMS computing and e.g. mass storage.
[Diagram: the FNAL CMS facility after the upgrade. The POPCRN production cluster (80 dual nodes), RAMEN batch nodes, the US-CMS testbed (GYOZA), dCache (>14 TB) serving user analysis, R&D machines BIGMAC and FRY (1 TB), and Enstore (17 drives), connected through a Cisco 6509 switch with ESnet (OC12) and MREN (OC3) links.]
[Diagram: user access to FNAL Objectivity data (Jets/MET now, muons coming). Users in Wisconsin, CERN, FNAL and Texas access >10 TB of objects over the network through snickers, an AMD server with a 1 TB IDE RAID that acts as the AMD/Enstore interface to the Enstore STKEN silo. Now also working with the disk cache.]
User Federations and FNAL
http://computing.fnal.gov/cms/Monitor/cms_production.html
Access to mass storage at FNAL (dCache and Enstore)
We have two cooperating systems, Enstore and dCache.
Enstore: network-attached tape, optimized for sequential access.
dCache: a disk farm allowing random access.
Both systems use the same name space and can be accessed via the pnfs pseudo-filesystem, e.g. "ls /pnfs/cms" lists the files.
To access Enstore, do:
setup -q stken encp
encp /pnfs/cms/production/Projects/.... .
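For example, a complete Enstore copy session might look like this (a sketch; the project and file names below are hypothetical, substitute a real path under /pnfs/cms):

source /usr/local/etc/setups.csh                             # make the UPS setup command available
setup -q stken encp                                          # select the encp client for the STKEN library
ls /pnfs/cms/production/Projects                             # browse the tape name space through pnfs
encp /pnfs/cms/production/Projects/myproject/myfile.root .   # copy one file from tape to the current directory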
Access to mass storage at FNAL (dCache and Enstore) (cont.)
I. To access dCache, do:
setup dcap
dccp /pnfs/cms/production/Projects/.... .
II. The preferred way to access files in mass storage is dCache (see the sketch below).
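A minimal dCache session, analogous to the Enstore one above (again a sketch; the file names are hypothetical):

source /usr/local/etc/setups.csh                             # UPS environment
setup dcap                                                   # dCache client tools (dccp)
dccp /pnfs/cms/production/Projects/myproject/myfile.root .   # read a file through the disk cache
dccp myhistos.root /pnfs/cms/wenzel/myhistos.root            # writing goes through the write pool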
What do we expect from dCache?
I. Making a multi-terabyte server farm look like one coherent and homogeneous storage system.
II. Rate adaptation between the application and the tertiary storage resources.
III. Optimized usage of expensive tape robot systems and drives by coordinated read and write requests. Use the dccp command instead of encp!
IV. No explicit staging is necessary to access the data (but prestaging is possible and in some cases desirable; see the sketch below).
V. The data access method is the same independent of where the data resides.
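For the prestaging case, a sketch of what this could look like (I am assuming the dccp -P option here, which requests a stage from tape without making a local copy; the path is hypothetical):

setup dcap
dccp -P /pnfs/cms/production/Projects/myproject/myfile.root   # assumption: -P = prestage request only, no copy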
What do we expect from dCache (continued)?
I. A high-performance and fault-tolerant transport protocol between applications and data servers.
II. Fault tolerance: no specialized servers whose crash could cause severe downtime.
III. Can be accessed directly from your application (e.g. ROOT's TDCacheFile class).
IV. An AMS dCache server has been developed by replacing the POSIX I/O with the dCache library, but we found the AMS server to be highly unstable; it is not clear if we will continue.
[Diagram: random access (personal analysis, production) goes through the CMS-specific disk cache; sequential access (file transfer) goes directly to Enstore, the hierarchical storage manager provided by CD/ISD.]
The current system consists of 5 x 1.2 TB (Linux) read pools plus one Sun server with a 1/4 TB RAID array as the write pool. We have two additional servers for R&D and funding for more (>5).
First results with the dCache system
No optimization yet; the default configuration will be upgraded to a newer kernel, the XFS filesystem and maybe dual Gbit connectivity. The tests include all overhead. The average file size is ~1 GByte and the reads are equally distributed over all read pools.

# of concurrent reads (40 farm nodes)    Aggregate throughput (sustained over hours)
70                                       108 MByte/sec
60                                       104 MByte/sec
5                                        42.5 MByte/sec
How to access data from ROOT
Setup, e.g. on Linux:

setenv ROOTSYS /afs/fnal.gov/files/code/cms/ROOT/3.03.08/i386_linux22/gcc-2.95.2
setenv LD_LIBRARY_PATH ${ROOTSYS}/lib:$LD_LIBRARY_PATH
setenv PATH ${ROOTSYS}/bin:$PATH

#include <TROOT.h>
#include <TRint.h>
#include <TFile.h>
#include <TDCacheFile.h>

int main(int argc, char **argv)
{
  static TROOT exclusivefit("main","B lifetime fitting");
  static TRint app("app",&argc,argv,NULL,0);
  // To create a new file in dCache instead of reading one:
  // TDCacheFile hfile("/pnfs/cms/wenzel/hsimple_dcache.root","CREATE","Demo ROOT file with histograms",0);
  TFile *hfile = new TDCacheFile("dcap://stkendca3a.fnal.gov:24125/pnfs/fnal.gov/usr/cms/wenzel/hsimple_dcache.root","READ","Demo ROOT file with histograms",0);
  hfile->ls();      // list the contents of the file
  hfile->Print();   // print summary information
  return 0;
}
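To build this as a standalone binary, something along these lines should work (a sketch: the source file name readfile.cxx is hypothetical, and -lDCache is my assumption for the name of ROOT's dCache library):

g++ readfile.cxx -o readfile `root-config --cflags --glibs` -lDCache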
Using the batch system
The batch system we use is FBSNG, which has been developed especially for farms. It runs on whopper and fry5-7.
But it is a farm batch system, so it takes some getting used to, and Kerberos principals are necessary if you want to do anything useful. So contact me before trying to use the system.
http://gyoza8.fnal.gov/cgi-bin/fbsng/fbswww/fbswww
FBSNG (cont.)
Set up FBSNG, then create a job description file (test.jdf):
SECTION main
EXEC=/afs/fnal.gov/files/home/room2/cmsprod/wenzel/test.csh
QUEUE=CMS
STDOUT=/data/fbs-logs/
STDERR=/data/fbs-logs/
#!/usr/local/bin/tcsh -f
# Example FBSNG job script (test.csh).
echo $FBS_SCRATCH                  # scratch directory provided by FBSNG
cd $FBS_SCRATCH
source /usr/local/etc/setups.csh   # set up the UPS environment
setenv PATH /bin:/usr/bin:/usr/local/bin:/usr/krb5/bin:/usr/afsws/bin
/bin/date
/bin/cat > stupid.file << "EOF"
!
! this is just a stupid example file
!
"EOF"
/bin/ls
/bin/pwd
cp ./stupid.file /data/fbs-logs    # copy the output to a shared area
FBSNG (cont.)
Submit the job and check its status:
fbs submit test.jdf
fbs status
Near-term plan (user)
I. Hardware is coming (late; tomorrow?) which needs to be installed, and the farm nodes will go through a one-month acceptance period.
II. Make the user batch system easy to use: a GUI to create jobs from templates and to submit them. This is basically done.
III. Make the farm usable for interactive use, similar to the lxplus cluster at CERN. Currently we are investigating two solutions to achieve load-balanced login: LVS and FBSNG. We need a solution for home areas and shared data areas; here we will investigate systems from Zambeel and Panasas.