Page 1: GridSuperscalar A programming model for GRID applications José Mª Cela cela@ac.upc.es cela@ciri.upc.es

GridSuperscalar: A programming model for GRID applications

José Mª Cela

cela@ac.upc.es, cela@ciri.upc.es

What is GRID?

Connected heterogeneous resources

[Diagram: interconnected sites labelled AC, LCFIB, CEPBA, Campus Castelldefels, and Campus Nord]

What do we have?

• 500 GFlops
• 40 PB disk space
• 10 TB RAM
• ...

What do we want?

• Transparent access to basic resources

Basic brick in a GRID

• Globus Toolkit 2.x

SERVICES, LIBRARIES, UTILITIES

• JAVA
• C (Unix)
• TCP/IP

• OPEN CODE
• DOCUMENTATION

What is GLOBUS?

• Globus Resource Allocation Manager (GRAM): manage and execute processes
• Grid Security Infrastructure (GSI): secure access
• Monitoring and Discovery Service (MDS): access to system information
• Global Access to Secondary Storage (GASS): data transfer

[Diagram labels: CLIENT, SERVER, LIBRARIES, UTILITIES]

Security

• Authentication
  – GRID user certificate: public key and private key
  – Temporary proxy
  – Resource certificate: public key and private key
• Authorization
  – GRID-MAPFILE: GRID User -> Local User

[Diagram: a GRID user with a certificate and a temporary proxy contacts the GATEKEEPER of a resource, which holds the resource certificate]
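The GRID User -> Local User mapping can be illustrated with a minimal C sketch. It assumes the usual grid-mapfile entry layout, a quoted certificate subject followed by a local account name. The helper name `map_grid_user` and the subjects in the usage note are hypothetical and are not taken from the slides.

```c
#include <string.h>

/* Parse one grid-mapfile style entry of the form
 *     "<certificate subject>" <local user>
 * and, if the subject matches, copy the local account name into `local`.
 * Returns 1 on a match, 0 otherwise.
 * Hypothetical helper for illustration; not part of Globus. */
int map_grid_user(const char *entry, const char *subject,
                  char *local, size_t local_size)
{
    const char *q1 = strchr(entry, '"');
    if (q1 == NULL) return 0;
    const char *q2 = strchr(q1 + 1, '"');
    if (q2 == NULL) return 0;

    /* Compare the quoted subject with the one we are looking for. */
    size_t len = (size_t)(q2 - q1 - 1);
    if (len != strlen(subject) || strncmp(q1 + 1, subject, len) != 0)
        return 0;

    /* Skip blanks between the closing quote and the local account name. */
    const char *p = q2 + 1;
    while (*p == ' ' || *p == '\t') p++;

    size_t n = strcspn(p, " \t\r\n");
    if (n == 0 || n >= local_size) return 0;
    memcpy(local, p, n);
    local[n] = '\0';
    return 1;
}
```

Called with an entry such as `"/O=Grid/OU=Example/CN=Jane Doe" jdoe` and the same subject, the sketch fills `local` with `jdoe`; any other subject yields no match.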

Monitoring Systems

• GRIS (Grid Resource Information Service)
• GIIS (Grid Index Information Service)

[Diagram: GRIS servers registered under GIIS index servers, up to a top-level GIIS-ALL]

Data transfer

• GSIFTP (gsiftp://)
  – High speed
  – It has implementation problems
• GASS (https://)
  – Always available
  – Slow speed

[Diagram: access to a RESOURCE through GSIFTP or GASS using a temporary proxy; labels: RESOURCE CERTIFICATION, GRID-MAPFILE (GRID User -> Local User), Local User, OK!]

Execute processes: RSL script

    '&(parameter=valueY)(parameter1=valueX)...'

    executable=<@executable>
    arguments=<arguments>
    stdin=<@standard input source>
    stdout=<@standard output destination>
    queue=<queue>
    environment=<environment variables>
    directory=<initial directory>
    ...

[Diagram: a GRAM CLIENT submits jobs to GRAM SERVERs on two kinds of resource: a resource WITH load balancing control (LoadLeveler, NQE) and a resource WITHOUT it; boxes labelled GridJob and No GridJob]
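The RSL attribute list above could be assembled programmatically. The following is a minimal C sketch, assuming only the `&(attribute=value)...` layout shown on the slide. `build_rsl` is a hypothetical helper, not a Globus function, and the paths in the usage note are invented for illustration.

```c
#include <stdio.h>

/* Build an RSL request string of the form
 *     &(executable=...)(arguments=...)(directory=...)(stdout=...)
 * Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper for illustration; not part of the Globus API. */
int build_rsl(char *buf, size_t size, const char *executable,
              const char *arguments, const char *directory,
              const char *stdout_path)
{
    int n = snprintf(buf, size,
                     "&(executable=%s)(arguments=%s)(directory=%s)(stdout=%s)",
                     executable, arguments, directory, stdout_path);
    /* snprintf reports the length it wanted to write, so truncation
     * is detected by comparing it with the buffer size. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For example, `build_rsl(rsl, sizeof rsl, "/bin/hostname", "-f", "/tmp", "/tmp/host.out")` yields a single-line request that could then be handed to a GRAM client.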

Execute processes

    #include "globus_gram_client.h"

    /* Callback and ERROR() are provided by the programmer. Callback is
       assumed to set `done` once the job reaches the DONE state. */
    extern volatile int done;

    int main(int argc, char *argv[])
    {
        char *rm_contact = argv[1];  /* resource manager contact (assumed to come from the command line) */
        char *callback_contact, *job_contact;
        int err, job_status, res;

        globus_module_activate( GLOBUS_GRAM_CLIENT_MODULE );

        err = globus_gram_client_callback_allow( Callback, NULL, &callback_contact );
        if (err != GLOBUS_SUCCESS) ERROR();

        /* RSL job description, written as adjacent string literals */
        const char *desc =
            "&(executable=/home/ac/cela/Dimemas)(arguments= -o kk.out file.cfg)"
            "(scratch_dir=/scratch/ac/cela)(directory=/home/ac/cela/)"
            "(file_stage_in=(https://kandake:20352/home/ac/cela/file.cfg $(SCRATCH_DIRECTORY)/file.cfg)"
            "(https://kandake:20352/home/ac/cela/matrix.trf $(SCRATCH_DIRECTORY)/matrix.trf))"
            "(file_stage_out=($(SCRATCH_DIRECTORY)/kk.out https://kandake:20352/home/ac/cela/kk.out))"
            "(environment=(DIMEMAS_HOME /usr/local/cepba-tools)(TMPDIR $(SCRATCH_DIRECTORY)))";

        err = globus_gram_client_job_request( rm_contact, desc, GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE,
                                              callback_contact, &job_contact );
        if (err != GLOBUS_SUCCESS) ERROR();

        err = globus_gram_client_job_status( job_contact, &job_status, &res );
        if (err != GLOBUS_SUCCESS) ERROR();

        while (!done) globus_poll_blocking();  /* wait for the state-change callback */

        globus_gram_client_job_contact_free( job_contact );
        globus_module_deactivate( GLOBUS_GRAM_CLIENT_MODULE );
        return 0;
    }

GridSuperscalar

• A programming paradigm which allows:
  – Reducing the complexity of developing Grid applications to the minimum
  – Automatic task parallelization over a GRID environment
• Basic idea: superscalar processors
  – Sequential control flow
  – Well-defined object name space, I/O arguments to operations
  – Automatic construction of precedence DAG
  – Renaming
  – Forwarding
  – DAG scheduling, locality management
  – Prediction, speculation

GridSuperscalar basis

• Code:
  – Sequential application in C/C++/Fortran90 with calls to the GridSuperscalar run-time (Execute call)
• Run-time performs:
  – Task identification (based on the Execute primitive)
  – Data dependency analysis: files are the objects
  – Data dependence graph creation
  – Task scheduling based on the graph
  – File renaming to increase graph concurrency
  – File forwarding

GridSuperscalar behavior overview

Application code:

    for (i = 0; i < N; i++)
    {
        Execute(T1, "file1.txt", "file2.txt");
        Execute(T2, "file4.txt", "file5.txt");
        Execute(T3, "file2.txt", "file5.txt", "file6.txt");
        Execute(T4, "file7.txt", "file8.txt");
        Execute(T5, "file6.txt", "file8.txt", "file9.txt");
    }

[Diagram: the task graph generated from successive loop iterations (T1_0 ... T5_0, T1_1 ... T5_1, T1_2, ...) is mapped onto the Grid]
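The file-based dependence analysis behind such a graph can be sketched in C. This is only an illustration under stated assumptions: the last file argument of each Execute call is taken to be the task's single output, and the `task_t` type and `build_dependences` function are hypothetical names, not part of the GridSuperscalar run-time.

```c
#include <string.h>

#define MAX_FILES 4

/* A task reads `inputs` and writes a single `output` file.
 * Treating the last Execute argument as the output is an assumption
 * made for this sketch. */
typedef struct {
    const char *name;
    const char *inputs[MAX_FILES];
    int n_inputs;
    const char *output;
} task_t;

/* Fill dep[i*n + j] = 1 when task i reads a file whose most recent
 * writer in program order is task j (a true, read-after-write
 * dependence). `dep` must hold n*n ints. */
void build_dependences(const task_t *tasks, int n, int *dep)
{
    memset(dep, 0, sizeof(int) * (size_t)n * (size_t)n);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < tasks[i].n_inputs; k++) {
            /* Scan earlier tasks from nearest to farthest: only the most
             * recent writer of the input file is a true predecessor. */
            for (int j = i - 1; j >= 0; j--) {
                if (strcmp(tasks[j].output, tasks[i].inputs[k]) == 0) {
                    dep[i * n + j] = 1;
                    break;
                }
            }
        }
    }
}
```

Applied to one iteration of the loop above, the sketch reports that T3 waits for T1 and T2, and that T5 waits for T3 and T4, while T1, T2, and T4 have no predecessors and may start at once.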

GridSuperscalar: user interface

• Steps when developing an application:
  1. Task definition: identify the subroutines/programs to be executed in the Grid
  2. Tasks' interface definition: input/output files and input/output generic scalars
  3. Write the sequential program using calls to the GridSuperscalar primitives (Execute)

Instruction set definition

GridSuperscalar: code example

• Simple optimization search example:
  – Perform Neval simulations
  – Recalculate the range of parameters
  – End when the goal is reached

1. Tasks: FILTER, DIMEMAS, EXTRACT
2. Parameters (current syntax):

    <OP_NAME> <n_in_files> <n_in_generic> <n_out_files> <n_out_generic>

    FILTER  1 2 1 0
    DIMEMAS 2 0 1 0
    EXTRACT 1 0 1 0

3. Sequential code

GridSuperscalar: code example

    range = initial_range();
    while (!goal_reached() && (j < MAX_ITERS)) {
        for (i = 0; i < Neval; i++) {
            L[i] = gen_rand_L_within_current_range(range);
            BW[i] = gen_rand_BW_within_current_range(range);
            Execute( FILTER, "bh.cfg", L[i], BW[i], "bh_tmp.cfg" );
            Execute( DIMEMAS, "bh_tmp.cfg", "trace.trf", "dim_out.txt" );
            Execute( EXTRACT, "dim_out.txt", "final_result.txt", "final_result.txt" );
        }
        GS_Barrier();
        generate_new_range("final_result.txt", &range);
        j++;
    }
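The interface definition lines of the form `<OP_NAME> <n_in_files> <n_in_generic> <n_out_files> <n_out_generic>` tell the run-time how to interpret the arguments of each Execute call. A minimal C sketch of reading one such line follows. The `op_interface_t` type and `parse_interface` function are hypothetical names chosen for illustration, and the sketch does not claim to reproduce how the actual run-time loads these definitions.

```c
#include <stdio.h>

/* One task interface: how many input/output files and generic scalars
 * an operation takes. Hypothetical type for illustration. */
typedef struct {
    char name[32];
    int n_in_files;
    int n_in_generic;
    int n_out_files;
    int n_out_generic;
} op_interface_t;

/* Parse one definition line such as "FILTER 1 2 1 0".
 * Returns 0 on success, -1 if the line is malformed. */
int parse_interface(const char *line, op_interface_t *op)
{
    int fields = sscanf(line, "%31s %d %d %d %d", op->name,
                        &op->n_in_files, &op->n_in_generic,
                        &op->n_out_files, &op->n_out_generic);
    return fields == 5 ? 0 : -1;
}
```

For `FILTER 1 2 1 0` this gives one input file, two generic inputs, and one output file, which matches the call `Execute( FILTER, "bh.cfg", L[i], BW[i], "bh_tmp.cfg" )`.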

GridSuperscalar: code example

• Worker:

    switch (atoi(argv[2]))
    {
        case FILTER:  res = filter(argc, argv);
                      break;
        case DIMEMAS: res = dimemas_funct(argc, argv);
                      break;
        case EXTRACT: res = extract(argc, argv);
                      break;
        default:      printf("Wrong operation code\n");
                      break;
    }

GridSuperscalar run-time: task graph generation

    range = initial_range();
    while (!goal_reached() && (j < MAX_ITERS)) {
        for (i = 0; i < Neval; i++) {
            L[i] = gen_rand_L_within_current_range(range);
            BW[i] = gen_rand_BW_within_current_range(range);
            Execute( FILTER, "bh.cfg", L[i], BW[i], "bh_tmp.cfg" );
            Execute( DIMEMAS, "bh_tmp.cfg", "trace.trf", "dim_out.txt" );
            Execute( EXTRACT, "dim_out.txt", "final_result.txt" );
        }
        GS_Barrier();
        generate_new_range("final_result.txt", &range);
        j++;
    }

[Diagram: task graph with Neval sequences of FILTER, DIMEMAS, and EXTRACT tasks, closed by a BARRIER]

GridSuperscalar: task scheduling

[Diagram: the FILTER, DIMEMAS, and EXTRACT tasks and the BARRIER of the task graph are scheduled onto the CIRI Grid]

• Additional function:
  – GS_Barrier
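Graph-based scheduling can be illustrated with a small C sketch that assigns each task to the earliest "wave" in which all of its predecessors have finished. It assumes unlimited Grid resources and tasks numbered in program order. `schedule_waves` is a hypothetical name, and the real run-time scheduler is not described on the slides in this detail.

```c
/* dep[i*n + j] != 0 means task i must wait for task j, with j < i
 * because tasks are numbered in program order.
 * wave[i] receives the earliest step at which task i can start:
 * 0 for tasks without predecessors, otherwise one more than the
 * latest predecessor. Returns the number of waves, which is the
 * length of the critical path measured in tasks. */
int schedule_waves(const int *dep, int n, int *wave)
{
    int n_waves = 0;
    for (int i = 0; i < n; i++) {
        wave[i] = 0;
        for (int j = 0; j < i; j++) {
            if (dep[i * n + j] && wave[j] + 1 > wave[i])
                wave[i] = wave[j] + 1;
        }
        if (wave[i] + 1 > n_waves)
            n_waves = wave[i] + 1;
    }
    return n_waves;
}
```

All tasks in the same wave are independent of each other and could be submitted to different resources at the same time. A barrier such as GS_Barrier corresponds to waiting until the last wave has completed.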

GridSuperscalar: renaming

[Diagram: task sequences T1_1 to T1_3, T2_1 to T2_3, ..., TN_1 to TN_3 that all use file "f1"; renamed instances "f1_1" and "f1_2" are shown]

• Additional functions:
  – GS_open, GS_close
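File renaming can be sketched in C as a small version table: every write to a file name gets a fresh versioned name such as "f1_1" or "f1_2", and every read is redirected to the latest version. The `rename_table_t` type and both functions are hypothetical names; the naming scheme follows the labels on the slide, but the actual run-time implementation is an assumption here.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 16

/* Tracks how many times each file name has been written so far. */
typedef struct {
    char name[MAX_NAMES][64];
    int version[MAX_NAMES];
    int count;
} rename_table_t;

/* Return the table index of `file`, adding it if it is new. */
static int find_or_add(rename_table_t *t, const char *file)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->name[i], file) == 0) return i;
    if (t->count == MAX_NAMES) return -1;
    snprintf(t->name[t->count], sizeof t->name[0], "%s", file);
    t->version[t->count] = 0;
    return t->count++;
}

/* Name a task should write to: a fresh version on every write, so a
 * later writer never has to wait for earlier readers or writers. */
int rename_for_write(rename_table_t *t, const char *file,
                     char *out, size_t size)
{
    int i = find_or_add(t, file);
    if (i < 0) return -1;
    t->version[i]++;
    snprintf(out, size, "%s_%d", file, t->version[i]);
    return 0;
}

/* Name a task should read from: the latest version, or the original
 * name if no task has written the file yet. */
int rename_for_read(rename_table_t *t, const char *file,
                    char *out, size_t size)
{
    int i = find_or_add(t, file);
    if (i < 0) return -1;
    if (t->version[i] == 0)
        snprintf(out, size, "%s", file);
    else
        snprintf(out, size, "%s_%d", file, t->version[i]);
    return 0;
}
```

Because the application still refers to the original name "f1", some way of translating names when the main program touches a file directly is needed, which is presumably the role of GS_open and GS_close.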

GridSuperscalar: file forwarding

[Diagram: without forwarding, T1 writes file f1 and T2 then reads it; with forwarding, f1 is passed from T1 to T2 by socket]

• Prototype implemented with Dyninst
• Allows different tasks to execute on different hardware resources
• Initial tests
• High overhead due to worker behavior mutation
• Future tests with the new Dyninst version 4.0

Grid superscalar: current Globus implementation

• Previous prototype over Condor and MW
• Current prototype over Globus 2.x, using the API
• File transfer, security, … provided by Globus
• Task submission
  – globus_gram_client_job_request
• Asynchronous end of task synchronization
  – Asynchronous state-change callbacks mechanism provided by Globus
  – globus_gram_client_callback_allow
  – callback_func function, provided by the programmer

GRID Performance Measurement

• NAS GRID benchmarks
  – Designed to provide an objective measure of GRID systems
  – At present they are only a paper-and-pencil description
  – The basic executables are the NAS Parallel Benchmarks
    • BT, SP, LU => Solvers
    • MG => Post-processor
    • FT => Data visualization
    • MF => Interpolation filter
  – Only the data flow graphs are defined
    • 4 data flow graphs proposed (ED, HC, VP, MB)

GRID Performance Measurement

• Embarrassingly Distributed (ED)
  – Parameter studies

[Diagram: Launch, nine independent SP tasks, Report]

GRID Performance Measurement

• Helical Chain (HC)
  – A set of flow calculations

[Diagram: Launch, three rows of BT, SP, and LU tasks linked through MF filters, Report]

GRID Performance Measurement

• Visualization Pipe (VP)
  – Solver + Post-processor + Visualization module

[Diagram: Launch, three rows of BT, MG, and FT tasks linked through MF filters, Report]

GRID Performance Measurement

• Mixed Bag (MB)
  – VP with asymmetry

[Diagram: Launch, layers of three LU, three MG, and three FT tasks linked through MF filters, Report]

GRID Performance Measurement

• Original script

    eval "(./bin/bt.$CLASS <<-EOF ) | $THROWIT /dev/null
    $ASCII $NAME $CLASS $ITER $WIDTH $DEPTH $PID $VERBOSE
    EOF"

GRID Performance Measurement

• Current code in our programming model

    sprintf( genIn, "%s %s %d %d %d %d %d %d",
             NAME, CLASE, ascii, iter, width, depth, pid, verbose );
    Execute( BT, "HC_BT_IN", genIn, "HC_BT_OUT" );