CEDPS: Center for Enabling Distributed Petascale Science

Brian Tierney
Lawrence Berkeley National Laboratory
http://www.cedps-scidac.org/

CEDPS in a Nutshell

• Center for Enabling Distributed Petascale Science
  • CEDPS is pronounced “seds” (silent P)
  • DOE SciDAC Center for Enabling Technology
  • July 1, 2006 – June 30, 2011, $2.4M/yr
• Collaboration between 5 sites
  • Argonne National Laboratory
  • Fermi National Accelerator Laboratory
  • Lawrence Berkeley National Laboratory
  • USC Information Sciences Institute
  • University of Wisconsin-Madison
• Three focus areas
  • Moving data to compute resources
  • Moving compute services to data sites
  • Troubleshooting and diagnosis tools

The Petascale Data Challenge

• DOE facilities generate many petabytes of data (2 petabytes = all U.S. academic research libraries!)
• Remote users (at labs, universities, industry) need data!
• Rapid, reliable access is key to maximizing the value of $B facilities

[Figure: massive data at DOE facilities, with remote distributed users (U) needing access]

Bridging the Divide (1): Move Data to Users When & Where Needed

“Deliver this 100 Terabytes to locations A, B, C by 9am tomorrow”

• Fast: >10,000x faster than usual Internet
• Reliable: recover from many failures
• Predictable: data arrives when scheduled
• Secure: protect expensive resources & data
• Scalable: deal with many users & much data
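
To give a feel for what the “100 Terabytes by 9am tomorrow” request implies, the short calculation below estimates the sustained rate needed per destination. It is only a back-of-the-envelope sketch: the 16-hour window (a request made at 5pm the previous day) and the reading of a terabyte as 10^12 bytes are assumptions, not figures from the slides.

```python
def required_gbps(num_bytes: float, hours: float) -> float:
    """Sustained rate, in gigabits per second, needed to move num_bytes in `hours`."""
    bits = num_bytes * 8
    seconds = hours * 3600
    return bits / seconds / 1e9


# Assumptions: 100 TB = 100 * 10**12 bytes, and 16 hours remain before the deadline.
rate = required_gbps(100e12, 16)
print(f"{rate:.1f} Gb/s sustained, per destination")
```

Any failure or slow period during the window raises the rate needed for the remaining time, which is why reliability and predictability sit next to raw speed in the list above.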

Bridging the Divide (2): Allow Users to Move Computation Near Data

“Perform my computation F on datasets X, Y, Z”

• Science services: provide analysis functions near data source
• Flexible: easy integration of functions
• Secure: protect expensive resources & data
• Scalable: deal with many users & much data

Bridging the Divide (3): Troubleshoot End-to-End Problems

“Why did my data transfer (or remote operation) fail?”

• Identify & diagnose failures & performance problems
• Instrument: include monitoring points in all system components
• Monitor: collect data in response to problems
• Diagnose: identify the source of problems

CEDPS Troubleshooting Work

The Troubleshooting Problem

• Large production Grids (OSG, TeraGrid, etc.) report a high failure rate
  • 10-30% of jobs submitted to the Grid fail
    • mostly authentication errors and disk space problems
  • Users don’t always notice, as jobs may be automatically resubmitted and may succeed the next time
  • Troubleshooting in this environment is very difficult
• Current Approach
  • Log into all hosts used (if possible)
  • ‘grep’ various log files looking for problems
    • Inconsistent logging levels
    • Multiple file formats
  • Often a tedious and time-consuming problem

Our Approach

• Discover broken services before users do
  • Deploy monitoring software that can raise alerts on errors
  • Run test jobs to detect problems
• When log analysis is needed
  • Use a unified approach to logging for applications and middleware
  • Collect logs centrally
  • Develop automatic analysis tools for anomaly detection

Logging Solution: “NetLogger” Methodology

• NetLogger (short for “Networked Application Logger”)
  • Methodology for troubleshooting distributed applications
  • Troubleshooting = detection and analysis of faults and performance issues
• Key Components:
  • common log format: name=value pairs
    • http://www.cedps.net/wiki/index.php/LoggingBestPractices
  • precision ISO-format timestamp and synchronized clocks (via NTP)
  • wrap all “interesting” actions with ‘start’ and ‘end’ log events
    • interesting = anything that might fail or run slow
  • use unique event names
    • e.g.: event=org.globus.gridFTP.authn.start
  • include “event ID” in each log message
    • allows correlation of sets of events
  • collect logs in a centralized database
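
The key components above can be sketched with nothing but the Python standard library. This is not the NetLogger client API; `format_event`, `logged`, the `org.example.demo` namespace and the `status=-1` failure convention are all hypothetical names chosen for illustration.

```python
import time
import uuid
from contextlib import contextmanager
from datetime import datetime, timezone


def format_event(event, **fields):
    """Return one log line of name=value pairs with an ISO-format UTC timestamp."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    parts = [f"ts={ts}", f"event={event}"]
    for name, value in fields.items():
        text = str(value)
        # Quote values containing spaces so the line still splits cleanly.
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}={text}")
    return " ".join(parts)


@contextmanager
def logged(event, guid, emit=print, **fields):
    """Wrap an action with '<event>.start' and '<event>.end' log events."""
    emit(format_event(event + ".start", guid=guid, **fields))
    status = 0
    try:
        yield
    except Exception:
        status = -1  # assumed convention: non-zero status marks a failure
        raise
    finally:
        emit(format_event(event + ".end", guid=guid, status=status))


if __name__ == "__main__":
    guid = str(uuid.uuid4()).upper()  # one ID ties the related events together
    with logged("org.example.demo.transfer", guid, file="/tmp/myfile"):
        time.sleep(0.01)  # stand-in for the real work
```

Because the end event is written in a `finally` clause, a failing action still produces a matching end line, which keeps start/end pairs intact for later analysis.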

Example Log: GridFTP

ts=2006-12-08T18:39:23.114369Z event=org.globus.gridFTP.start prog=GridFTP-4.0.3 localhost=myhost remoteHost=somehost.gov:56010 serverMode=inetd guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3

ts=2006-12-08T18:39:23.114567Z event=org.globus.gridFTP.authn.start DN=“/DC=org/DC=doegrids/OU=People/CN=Somebody” guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3

ts=2006-12-08T18:39:25.514369Z event=org.globus.gridFTP.authn.end DN=“/DC=org/DC=doegrids/OU=People/CN=Somebody” msg=“123456 successfully authorized” localUser=uscmspool381 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0

ts=2006-12-08T18:39:25.864369Z event=org.globus.gridFTP.transfer.start file=/tmp/myfile tcpBufferSize=128KB dataBlockSize=262144 numStreams=1 numStripes=1 destHost=129.79.4.64 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3

ts=2006-12-08T18:45:02.214369Z event=org.globus.gridFTP.transfer.end file=/tmp/myfile bytesTransferred=678433 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0

ts=2006-12-08T18:45:02.214386Z event=org.globus.gridFTP.end guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=226
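
Because every line is a set of name=value pairs carrying a timestamp and a guid, elapsed times fall out of a few lines of parsing. The sketch below is an illustration rather than a NetLogger tool: it uses an abridged copy of the authn and transfer events above (with plain ASCII quotes), and `parse_line` and `durations` are hypothetical helper names.

```python
import shlex
from datetime import datetime

# Two start/end pairs abridged from the GridFTP example, with ASCII quotes.
SAMPLE = """\
ts=2006-12-08T18:39:23.114567Z event=org.globus.gridFTP.authn.start DN="/DC=org/DC=doegrids/OU=People/CN=Somebody" guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3
ts=2006-12-08T18:39:25.514369Z event=org.globus.gridFTP.authn.end msg="123456 successfully authorized" guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0
ts=2006-12-08T18:39:25.864369Z event=org.globus.gridFTP.transfer.start file=/tmp/myfile guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3
ts=2006-12-08T18:45:02.214369Z event=org.globus.gridFTP.transfer.end file=/tmp/myfile bytesTransferred=678433 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0
"""


def parse_line(line):
    """Split one log line into a dict; shlex keeps quoted values together."""
    record = {}
    for token in shlex.split(line):
        name, _, value = token.partition("=")
        record[name] = value
    return record


def durations(lines):
    """Return {(guid, base_event): seconds} for each matched start/end pair."""
    started, result = {}, {}
    for line in lines:
        rec = parse_line(line)
        base, _, suffix = rec["event"].rpartition(".")
        when = datetime.strptime(rec["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
        key = (rec["guid"], base)
        if suffix == "start":
            started[key] = when
        elif suffix == "end" and key in started:
            result[key] = (when - started.pop(key)).total_seconds()
    return result


for (guid, event), secs in durations(SAMPLE.splitlines()).items():
    print(f"{event}: {secs:.3f} s")
```

Run on these four lines, it reports about 2.4 seconds for authentication and about 336 seconds for the transfer.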

Log Collection

• No need to invent something new for this
• syslog-ng tool fills all requirements
  • Open source, runs on all major OSes
  • Fault tolerant, secure (via stunnel), scalable, easy to configure, etc.
  • Large user base
  • Can filter logs based on level and content
  • Arbitrary number of sources and destinations
  • Can act as a proxy, tunnel through firewalls
  • Execute programs: send email, load database, etc.
  • Built-in log rotation
  • Timezone support and fully qualified host names
• http://www.balabit.com/products/syslog-ng/
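
The slides do not show how applications hand their lines to the collector. One possibility, sketched here under the assumption that the collector accepts standard syslog messages over UDP, is Python's built-in `logging.handlers.SysLogHandler`. The helper name `make_collector_logger` and the logger name are hypothetical.

```python
import logging
import logging.handlers


def make_collector_logger(host, port, name="netlogger.demo"):
    """Return a logger that forwards each record to a syslog collector over UDP."""
    handler = logging.handlers.SysLogHandler(address=(host, port))
    # Send the message text unchanged, so name=value lines arrive intact.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicating lines on the root logger
    return logger
```

A caller would create the logger once with the collector's host and port and then pass each name=value line to `logger.info(...)`. UDP delivery is best effort, so a deployment that needs the fault tolerance listed above would rely on the collector's own transport options instead.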

Log collection using syslog-ng

Sample Site Deployment For Grid Middleware

Troubleshooting Architecture: Site Archive

Log Summarization Library: New Component of the NetLogger Toolkit
http://dsd.lbl.gov/NetLogger/

NetLogger Toolkit

• Client library that makes it easy to generate logs
  • C, Java, Perl, Python
  • First released in 1994
  • Summarizer new in NetLogger 4.0 release, Nov. 2007
• Open Source
  • available at http://dsd.lbl.gov/NetLogger/
• Visualization Support
  • scripts to convert logs to R, gnuplot, Excel format
  • collection of sample R and gnuplot scripts
• Database tools
  • tools to load logs into MySQL

New NetLogger Feature: Time-based summarization

Example: you want detailed I/O instrumentation of a simple loop that reads data, processes it, then writes out some result.

    DO for i=1, N
        log("read.start", ...)
        read_data()
        log("read.end", ...)

        process_data()

        log("write.start", ...)
        write_data()
        log("write.end", ...)
    DONE
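
A runnable Python rendering of this loop makes the log volume concrete. It is a sketch only: `read_data`, `process_data` and `write_data` are hypothetical stand-ins, and `log` here is any callable rather than the NetLogger API.

```python
def run_loop(n, log):
    """Read, process and write n items, logging start/end around each I/O call."""
    for i in range(n):
        log("read.start", i)
        data = read_data(i)
        log("read.end", i)

        result = process_data(data)

        log("write.start", i)
        write_data(result)
        log("write.end", i)


# Hypothetical stand-ins for the real I/O and processing steps.
def read_data(i):
    return [i] * 4


def process_data(data):
    return sum(data)


def write_data(result):
    pass


events = []
run_loop(3, lambda name, i: events.append((name, i)))
print(len(events), "log events for 3 iterations")
```

Full logging costs four events per iteration, so a loop that runs millions of times produces millions of log lines.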

Time-based summarization

• Full logging can produce many gigabytes of logs!
• Summarized logs report only a periodic summary (mean, sd) of times and values between “start” and “end” events (of a given type and ID).
• Orders of magnitude data reduction with, for many purposes, no loss in explanatory power.
• Since the original instrumentation is still there, you can turn the full detail on and off in a running program.
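
The idea can be illustrated with a small accumulator that pairs start and end events and emits one (count, mean, sd) record per event type per time window. This is a minimal sketch, not the NetLogger summarizer: the `Summarizer` class, its method names, and the use of a caller-supplied integer clock in milliseconds are assumptions made for the example.

```python
import math


class Summarizer:
    """Accumulate start/end durations and emit one summary per time window."""

    def __init__(self, window):
        self.window = window      # summary period, in the caller's time units
        self.window_start = None
        self.starts = {}          # (event, id) -> start time
        self.samples = {}         # event -> durations seen in this window
        self.summaries = []       # emitted (event, count, mean, sd) tuples

    def log(self, event, ident, now):
        if self.window_start is None:
            self.window_start = now
        base, _, suffix = event.rpartition(".")
        if suffix == "start":
            self.starts[(base, ident)] = now
        elif suffix == "end" and (base, ident) in self.starts:
            elapsed = now - self.starts.pop((base, ident))
            self.samples.setdefault(base, []).append(elapsed)
        if now - self.window_start >= self.window:
            self.flush()
            self.window_start = now

    def flush(self):
        """Emit mean and standard deviation per event, then reset the window."""
        for base, values in self.samples.items():
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            self.summaries.append((base, len(values), mean, math.sqrt(var)))
        self.samples = {}


summ = Summarizer(window=1000)  # one summary per 1000 ms
clock = 0                       # simulated clock, in milliseconds
for i in range(100):
    summ.log("read.start", i, clock)
    clock += 20                 # pretend each read takes 20 ms
    summ.log("read.end", i, clock)
summ.flush()
print(len(summ.summaries), "summary records instead of 200 log events")
```

The instrumentation calls in the loop are unchanged; only what happens inside `log` differs, which is what lets full detail be switched back on without touching the application code.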

Sample Use: GridFTP Bottleneck Detector

• GridFTP today
  • Only logs single throughput number
  • Throughput might be limited by
    • Sending/receiving disk
      • NFS/AFS mounted partition?
    • Network
  • Currently no way to tell
• New GridFTP enhancement
  • Use log summarization library to monitor all I/O streams
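
Once each I/O stream has its own summary, finding the limiting stage reduces to comparing rates. The sketch below shows that comparison only; the stream names and every number in it are made up for illustration, and `find_bottleneck` is a hypothetical helper, not part of GridFTP or NetLogger.

```python
def find_bottleneck(stream_rates):
    """Return the (name, rate) of the slowest I/O stream, in bytes per second."""
    name = min(stream_rates, key=stream_rates.get)
    return name, stream_rates[name]


# Hypothetical per-stream summaries: (bytes moved, seconds spent between start and end).
summaries = {
    "disk.read": (500_000_000, 10.0),
    "net.write": (500_000_000, 2.5),
    "net.read": (500_000_000, 2.5),
    "disk.write": (500_000_000, 4.0),
}
rates = {name: nbytes / secs for name, (nbytes, secs) in summaries.items()}
name, rate = find_bottleneck(rates)
print(f"Bottleneck = {name} ({rate / 1e6:.0f} MB/s)")
```

With a single end-to-end throughput number, all four stages would be indistinguishable; the per-stream breakdown is what makes the slowest one visible.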

Full vs Summary Results

Bottleneck = disk read

More Information

• General CEDPS information:
  • http://www.cedps.net
• Log Summarizer:
  • http://dsd.lbl.gov/NetLogger
• New GridFTP with NetLogger Log Summarizer:
  • http://www.cedps.net/wiki/index.php/Gridftp-netlogger
• Contact us if you need troubleshooting help!
  • email: BLTierney@lbl.gov, DKGunter@lbl.gov

Extra Slides

NetLogger Summarizer vs ‘R’ summarized logs

Logging “Best Practices” Recommendations

• Practices
  • All logs should contain a unique event name and an ISO-format timestamp
  • All system operations that might fail or experience performance variations should be wrapped with start and end events
  • All logs from a given execution thread should be tagged with a globally unique ID (or GUID), such as a Universally Unique Identifier (UUID)
• Log format
  • Logs should be composed of lines of ASCII name=value pairs
  • Example: ts=2006-12-08T18:48:27.598448Z event=org.globus.gridFTP.transfer.start prog=GridFTP-v4.2 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 file=filename src.host=H1 src.port=P1 dst.host=H2 dst.port=P2
• http://www.cedps.net/wiki/index.php/LoggingBestPractice
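
These recommendations are simple enough to check mechanically. The following sketch flags lines that lack a timestamp, event name or guid, or whose timestamp is not in the ISO form used in the example. The function name `check_line` and the exact rule set are assumptions for illustration, not an official validator.

```python
import re
import shlex

# ISO-format UTC timestamp with optional fractional seconds, as in the example line.
ISO_TS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$")
REQUIRED = ("ts", "event", "guid")


def check_line(line):
    """Return a list of problems found in one name=value log line."""
    fields = dict(tok.partition("=")[::2] for tok in shlex.split(line))
    problems = [f"missing {name}" for name in REQUIRED if not fields.get(name)]
    if fields.get("ts") and not ISO_TS.match(fields["ts"]):
        problems.append("ts is not an ISO-format UTC timestamp")
    if not line.isascii():
        problems.append("non-ASCII characters")
    return problems
```

An empty list means the line passes these checks; a site could run such a check over new log sources before adding them to central collection.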

Event Names

• Use a '.' as a separator and go from general to specific
  • Same as Java class names
• First part of name should be used as a unique namespace (e.g.: org.globus)
• Use start/end suffixes whenever possible
  • Helps immensely with troubleshooting
• Examples
  • org.globus.gridFTP.start
  • org.globus.gridFTP.authn.start
  • org.globus.gridFTP.authn.end
  • org.globus.gridFTP.transfer.start
  • org.globus.gridFTP.transfer.end
  • org.globus.gridFTP.end

  • org.globus.MDS.response.start
  • org.globus.MDS.query.start
  • org.globus.MDS.query.end
  • org.globus.MDS.write.net.start
  • org.globus.MDS.write.net.end
  • org.globus.MDS.response.end
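
One reason start/end suffixes help with troubleshooting is that unfinished operations become easy to spot. The sketch below checks that a sequence of event names nests cleanly. It assumes events from a single thread arrive in log order and that operations nest strictly, which is an assumption of this example rather than a stated NetLogger rule; `unmatched_events` is a hypothetical helper.

```python
def unmatched_events(events):
    """Return events that break start/end nesting, given names in log order."""
    stack, problems = [], []
    for name in events:
        base, _, suffix = name.rpartition(".")
        if suffix == "start":
            stack.append(base)
        elif suffix == "end":
            if stack and stack[-1] == base:
                stack.pop()
            else:
                problems.append(name)
    # Anything still open never logged its end event.
    problems.extend(base + ".start" for base in stack)
    return problems


mds = [
    "org.globus.MDS.response.start",
    "org.globus.MDS.query.start",
    "org.globus.MDS.query.end",
    "org.globus.MDS.write.net.start",
    "org.globus.MDS.write.net.end",
    "org.globus.MDS.response.end",
]
print(unmatched_events(mds))        # the MDS sequence nests cleanly
print(unmatched_events(mds[:-1]))   # response.start is left open
```

A start event with no matching end points directly at the operation that hung or crashed.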

Globally Unique IDs

• Use the ‘guid’ or ‘id’ reserved name to allow correlation of a set of events
  • event=org.globus.gridFTP.authn.start id=27023
  • event=org.globus.gridFTP.authn.end id=27023
  • event=org.globus.gridFTP.transfer.start id=27023
  • event=org.globus.gridFTP.transfer.end id=27023
• Can use the standard Unix/Windows program ‘uuidgen’ to generate a globally unique ID
  • e.g.: A5A563CD-D80C-4E58-9ECD-79C6B611E122
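
Inside a Python program, the standard `uuid` module can play the role of ‘uuidgen’, and grouping records by ID is a few lines of dictionary work. Both helper names below are hypothetical, and records are assumed to be already parsed into dicts.

```python
import uuid


def new_guid():
    """Return a random UUID in the upper-case form shown above."""
    return str(uuid.uuid4()).upper()


def group_by_id(records, key="guid"):
    """Group parsed log records (dicts) by their correlation ID."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec["event"])
    return groups


records = [
    {"event": "org.globus.gridFTP.authn.start", "id": "27023"},
    {"event": "org.globus.gridFTP.authn.end", "id": "27023"},
    {"event": "org.globus.gridFTP.transfer.start", "id": "27023"},
    {"event": "org.globus.gridFTP.transfer.end", "id": "27023"},
]
print(group_by_id(records, key="id"))
print(new_guid())
```

Generating the ID once per operation and passing it to every log call is what allows events from different components to be stitched back together later.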

GridFTP: network vs disk performance

Syslog-ng Deployment for OSG