> Solving Big problems with OS: Condor > Antonio Sanz ([email protected]) > 09 / Nov / 11

Solving BIG problems with Open Source: Condor

DESCRIPTION

This is an introduction to the queue distribution system Condor and how we are using it at the I3A (http://i3a.unizar.es).

Page 1: Solving BIG problems with Open Source: Condor

> Solving Big problems with OS: Condor

> Antonio Sanz ([email protected]) > 09 / Nov / 11

Page 2: Solving BIG problems with Open Source: Condor

Page 3: Solving BIG problems with Open Source: Condor

> Antonio Sanz

> I3A System Manager

> HERMES HPC cluster sysadmin

> [email protected]

> @antoniosanzalc

Page 4: Solving BIG problems with Open Source: Condor

Page 5: Solving BIG problems with Open Source: Condor

Show / Know / Use

Page 6: Solving BIG problems with Open Source: Condor

Initial problem

3. Queue management systems: Condor

> Dr. Good

> Neurologist

> Alzheimer's research

> Process 20,000 brain image scans (1 h/image)

> A thousand times. Maybe two.
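A back-of-the-envelope estimate shows why this counts as a BIG problem. The sketch below uses only the figures on the slide (20,000 scans, 1 hour each, repeated a thousand times) and assumes perfect scaling; the 500-core cluster size is a hypothetical number chosen for illustration, not a figure from the talk.

```shell
#!/bin/sh
# Rough cost of Dr. Good's workload, using the figures from the slide.
IMAGES=20000       # brain image scans (from the slide)
HOURS_PER_IMAGE=1  # processing time per scan (from the slide)
RUNS=1000          # "a thousand times" (from the slide)
CORES=500          # hypothetical cluster size, NOT from the talk

# Integer arithmetic only, so results are rounded down.
TOTAL_HOURS=$((IMAGES * HOURS_PER_IMAGE * RUNS))
SERIAL_YEARS=$((TOTAL_HOURS / 24 / 365))
PARALLEL_DAYS=$((TOTAL_HOURS / CORES / 24))

echo "Total CPU hours:          $TOTAL_HOURS"
echo "On a single core (years): $SERIAL_YEARS"
echo "On $CORES cores (days):     $PARALLEL_DAYS"
```

Even under these optimistic assumptions the serial run is out of the question, which is the motivation for distributing the work across many machines.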

Page 7: Solving BIG problems with Open Source: Condor

Initial problem

> Mrs. Nice

> Santa’s Logistics Officer

> Gift transportation

> Analyze 6x10e7 possible load/reindeer/route combinations (10 min/analysis)

> Before Christmas!

Page 8: Solving BIG problems with Open Source: Condor

Hey … ! It’s a 64K one !

Page 9: Solving BIG problems with Open Source: Condor

Queue distribution systems

Page 10: Solving BIG problems with Open Source: Condor

Condor Basics

Single queue

Page 11: Solving BIG problems with Open Source: Condor

Page 12: Solving BIG problems with Open Source: Condor

Multiple queues

Page 13: Solving BIG problems with Open Source: Condor

Page 14: Solving BIG problems with Open Source: Condor

Problem partitioning

Page 15: Solving BIG problems with Open Source: Condor

Problem can be broken into independent pieces

Page 16: Solving BIG problems with Open Source: Condor

Oh Yeah!

Page 17: Solving BIG problems with Open Source: Condor

For loops are your best friends
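A for loop whose iterations do not depend on each other can be cut into chunks, one chunk per job. The sketch below is only an illustration of that idea: `TOTAL`, `JOBS` and the `chunk_bounds` helper are hypothetical names, and in a real Condor job the job index would typically arrive as an argument (for example via `Arguments = $(Process)`).

```shell
#!/bin/sh
# Sketch: split one long for loop into independent chunks, one per job.
TOTAL=20000   # iterations of the original serial loop (illustrative)
JOBS=100      # hypothetical number of jobs to queue

# Print the first and last iteration (0-based) handled by job number $1.
chunk_bounds() {
    job=$1
    size=$(( (TOTAL + JOBS - 1) / JOBS ))   # ceiling division
    first=$(( job * size ))
    last=$(( first + size - 1 ))
    # The final chunk may be shorter than the others.
    if [ "$last" -ge "$TOTAL" ]; then
        last=$(( TOTAL - 1 ))
    fi
    echo "$first $last"
}

# Each job would run only its own slice of the original loop.
for job in 0 1 99; do
    echo "job $job handles iterations $(chunk_bounds "$job")"
done
```

Because every job computes its bounds from its own index, no coordination between jobs is needed while they run.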

Page 18: Solving BIG problems with Open Source: Condor

While loops …can sometimes be convinced

Page 19: Solving BIG problems with Open Source: Condor

Do it yourself!

Page 20: Solving BIG problems with Open Source: Condor

Page 21: Solving BIG problems with Open Source: Condor

Heterogeneous computing

Page 22: Solving BIG problems with Open Source: Condor

Resource harvesting

Page 23: Solving BIG problems with Open Source: Condor

Requirements
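In Condor, a job states what it needs through a `Requirements` expression that is evaluated against each machine's attributes. The sketch below writes a hypothetical submit file: `needs.sub`, `analyze.sh` and the 2048 MB threshold are invented for illustration, while `OpSys`, `Arch` and `Memory` (in MB) are standard machine attributes.

```shell
#!/bin/sh
# Sketch: generate a submit description that restricts where a job may run.
# File names and the memory threshold are hypothetical.
cat > needs.sub <<'EOF'
Universe     = vanilla
Executable   = analyze.sh
# Only match 64-bit Linux machines with at least 2 GB of memory.
Requirements = (OpSys == "LINUX") && (Arch == "X86_64") && (Memory >= 2048)
Log          = analyze.log
Output       = analyze.out
Error        = analyze.err
Queue
EOF

# Show the constraint that would be evaluated for each candidate machine.
grep '^Requirements' needs.sub
```

A file like this would then be handed to the scheduler with `condor_submit needs.sub`.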

Page 24: Solving BIG problems with Open Source: Condor

Job Surveillance

Page 25: Solving BIG problems with Open Source: Condor

Fair use of resources

Page 26: Solving BIG problems with Open Source: Condor

Checkpoints

Page 27: Solving BIG problems with Open Source: Condor

Nested jobs (DAG)
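Dependencies between jobs are described to DAGMan as a directed acyclic graph. The sketch below writes a small, hypothetical workflow (split the input, process two halves in parallel, merge the results); the node names and the `.sub` files they point to are placeholders, not files from the talk.

```shell
#!/bin/sh
# Sketch: describe a four-node workflow for DAGMan.
# pipeline.dag and the referenced submit files are hypothetical.
cat > pipeline.dag <<'EOF'
# One JOB line per node: a node name and its submit description file.
JOB split  split.sub
JOB work1  work.sub
JOB work2  work.sub
JOB merge  merge.sub
# Edges: both workers wait for split, merge waits for both workers.
PARENT split CHILD work1 work2
PARENT work1 work2 CHILD merge
EOF

echo "Nodes defined: $(grep -c '^JOB' pipeline.dag)"
```

The whole graph would be submitted in one go with `condor_submit_dag pipeline.dag`, and DAGMan would release each node once its parents have finished.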

Page 28: Solving BIG problems with Open Source: Condor

Email Notifications

Page 29: Solving BIG problems with Open Source: Condor

Grid & Cloud Computing

Page 30: Solving BIG problems with Open Source: Condor

Flexibility

Page 31: Solving BIG problems with Open Source: Condor

… with Hadoop, MPI, OpenMP, GPU

Page 32: Solving BIG problems with Open Source: Condor

Page 33: Solving BIG problems with Open Source: Condor

How Condor works

Page 34: Solving BIG problems with Open Source: Condor

Management

[Hello, Dave]

Page 35: Solving BIG problems with Open Source: Condor

Compute

Page 36: Solving BIG problems with Open Source: Condor

Job list → ClassAd

Page 37: Solving BIG problems with Open Source: Condor

Resource list → ClassAd

Page 38: Solving BIG problems with Open Source: Condor

Matchmaking

Page 39: Solving BIG problems with Open Source: Condor

Priority Management

Page 40: Solving BIG problems with Open Source: Condor

Data Transfer

Page 41: Solving BIG problems with Open Source: Condor

Job running

Page 42: Solving BIG problems with Open Source: Condor

Job Monitoring

Page 43: Solving BIG problems with Open Source: Condor

Job End

Page 44: Solving BIG problems with Open Source: Condor

Example

Page 45: Solving BIG problems with Open Source: Condor

Hello, World !!

#!/bin/sh
# I'm hola.sh
echo Hola mundo desde `hostname`

#
# A Hello World .. In Condor!
#
# I'm hello.sub
Universe = vanilla
Executable = hola.sh
Log = hola.log
Output = hola.out
Error = hola.err
Queue

Page 46: Solving BIG problems with Open Source: Condor

Launching the computation

condor_submit

4. Condor Basics – An easy computation

Page 47: Solving BIG problems with Open Source: Condor

Launching the computation

condor_q
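Put together, the two commands give the basic submit-and-watch cycle. The sketch below assumes the `hello.sub` and `hola.sh` files from the Hello World slide sit in the current directory; the `STATUS` variable and the check for the Condor tools are additions for illustration, so that the script degrades gracefully on a machine without Condor.

```shell
#!/bin/sh
# Sketch: submit the Hello World job and inspect the queue.
# Assumes hello.sub and hola.sh exist in the current directory.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit hello.sub   # hand the job description to the scheduler
    condor_q                  # list the jobs currently waiting or running
    STATUS="submitted"
else
    STATUS="condor not installed"
    echo "Condor tools not found in PATH; nothing was submitted."
fi
```

Once the job has left the queue, its output would be found in `hola.out` and its event history in `hola.log`, as named in the submit file.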

Page 48: Solving BIG problems with Open Source: Condor

Something tastier…

#!/bin/sh
# I'm hello2.sh
OUTPUT=hello${1}.result
cat hello.input >> "$OUTPUT"
echo "Hello world, I'm job $1 here from `hostname`" >> "$OUTPUT"

# Execute n times with different outputs
Universe = vanilla
Executable = hello2.sh
Transfer_input_files = hello.input
WhenToTransferOutput = ON_EXIT_OR_EVICT
Arguments = $(Process)
Log = hello.log
Output = hello.out
Queue 10

Page 49: Solving BIG problems with Open Source: Condor

Perfect Simulation

Page 50: Solving BIG problems with Open Source: Condor

Extra Bonus

Page 51: Solving BIG problems with Open Source: Condor

Dynamic Partitioning

Page 52: Solving BIG problems with Open Source: Condor

Configurable Jobs

Page 53: Solving BIG problems with Open Source: Condor

Advanced Accounting

Page 54: Solving BIG problems with Open Source: Condor

Dynamic Checkpointing

Page 55: Solving BIG problems with Open Source: Condor

Hadoop Integration

Page 56: Solving BIG problems with Open Source: Condor

Green Computing

Page 57: Solving BIG problems with Open Source: Condor

GPU Integration

Page 58: Solving BIG problems with Open Source: Condor

I3A & Condor

Page 59: Solving BIG problems with Open Source: Condor

Gaming AI

Page 60: Solving BIG problems with Open Source: Condor

MRI Brain Analysis

Page 61: Solving BIG problems with Open Source: Condor

Communication Systems

Page 62: Solving BIG problems with Open Source: Condor

Tissue Modelling

Page 63: Solving BIG problems with Open Source: Condor

Page 64: Solving BIG problems with Open Source: Condor

> Conclusions

Page 65: Solving BIG problems with Open Source: Condor

Example

Page 66: Solving BIG problems with Open Source: Condor

Antonio Sanz

[email protected]

@antoniosanzalc

Slides here:

http://web.hermes.cps.unizar.es/doc/condor.pdf