Workshop on Grid Applications Programming, July 2004
GRID superscalar: a programming paradigm for GRID applications
CEPBA-IBM Research Institute
Raül Sirvent, Josep M. Pérez, Rosa M. Badia, Jesús Labarta
Outline
• Objective
• The essence
• User’s interface
• Automatic code generation
• Run-time features
• Programming experiences
• Ongoing work
• Conclusions
Objective
• Ease the programming of GRID applications
• Basic idea:
[Figure: floorplan of a superscalar processor (blocks labelled L3 Directory/Control, L2, LSU, IFU, BXU, IDU, FPU, FXU, ISU) shown next to the Grid. Time scale: ns on the processor, seconds/minutes/hours on the Grid.]
The essence
• Assembly language for the GRID
– Simple sequential programming, well defined operations and operands
– C/C++, Perl, …
• Automatic run time “parallelization”
– Use architectural concepts from microprocessor design
• Instruction window (DAG), dependence analysis, scheduling, locality, renaming, forwarding, prediction, speculation, …
The essence
for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    // subst, dimemas, post and display are the operations declared in the IDL file
    subst (referenceCFG, newBWd, newCFG);
    dimemas (newCFG, traceFile, DimemasOUT);
    post (newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);
printf("Results file:\n");
present (fd);
GS_Close(fd);
The essence
[Figure: task graph generated from the loop: repeated chains of Subst, DIMEMAS and Post tasks, with Display tasks and a final GS_open, mapped onto the CIRI Grid]
User’s interface
• Three components:
– Main program
– Subroutines/functions
– Interface Definition Language (IDL) file
• Programming languages: C/C++, Perl
User’s interface
• A typical sequential program
– Main program:
for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    subst (referenceCFG, newBWd, newCFG);
    dimemas (newCFG, traceFile, DimemasOUT);
    post (newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);
printf("Results file:\n");
present (fd);
GS_Close(fd);
User’s interface
• A typical sequential program
– Subroutines/functions
void dimemas(in File newCFG, in File traceFile, out File DimemasOUT)
{
    char command[200];
    putenv("DIMEMAS_HOME=/usr/local/cepba-tools");
    sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s", DimemasOUT, newCFG);
    GS_System(command);
}
void display(in File toplot)
{
    char command[500];
    sprintf(command, "./display.sh %s", toplot);
    GS_System(command);
}
User’s interface
• GRID superscalar programming requirements
– Main program: open/close files with
• GS_FOpen, GS_Open, GS_FClose, GS_Close
– Subroutines/functions
• Temporary files in the local directory, or with a name that is unique per subroutine invocation
• GS_System instead of system
• All required input/output files must be passed as arguments
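The temporary-file requirement above could be met in several ways. The following is a minimal sketch of one of them, assuming a POSIX system; `unique_temp_name` is a hypothetical helper and not part of the GRID superscalar API. It combines the process id with a per-process counter, which keeps names apart within one host but would not separate workers on different hosts that share a directory.

```cpp
#include <atomic>
#include <string>
#include <unistd.h>   // getpid (POSIX)

// Hypothetical helper, NOT part of GRID superscalar.
// Returns a scratch file name in the current directory that differs
// on every call (counter) and between processes on one host (pid).
std::string unique_temp_name(const std::string &prefix)
{
    static std::atomic<unsigned long> counter{0};
    const unsigned long n = counter.fetch_add(1);
    return "./" + prefix + "." +
           std::to_string(static_cast<long>(getpid())) + "." +
           std::to_string(n) + ".tmp";
}
```

A subroutine would call `unique_temp_name("dimemas")` once per invocation and use the returned name for its scratch file.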
User’s interface
• Gridifying the sequential program
– CORBA-IDL-like interface:
• In/Out/InOut files
• Scalar values (in or out)
– The subroutines/functions listed in this file will be executed on a remote server in the Grid.
interface MC {
    void subst(in File referenceCFG, in double newBW, out File newCFG);
    void dimemas(in File newCFG, in File traceFile, out File DimemasOUT);
    void post(in File newCFG, in File DimemasOUT, inout File FinalOUT);
    void display(in File toplot);
};
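Automatic code generation: C
[Figure: code generation flow. gsstubgen processes app.idl; the files shown are app.h, app-stubs.c, app-worker.c, app.c and app-functions.c, grouped into a client side and a server side.]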
Run-time features
• Data dependence analysis
– Detects RaW, WaR, WaW dependencies based on file parameters
– Tasks’ Directed Acyclic Graph is built based on these dependencies
• File renaming
– WaW and WaR dependencies are avoidable with renaming
• Shared disks management
– Supports shared working directories: NFS
– Allows shared input directories: mirrors of large DBs
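The dependence analysis described above can be illustrated with a small, self-contained sketch. It is not the GRID superscalar run-time; all type and function names (`Task`, `Param`, `analyse`, `loop_tasks`, `count_kind`) are hypothetical. Each task lists its file parameters with a direction, and the analysis walks the tasks in program order, remembering the last writer of each file and the readers since that write. `loop_tasks` models the example loop without the Display calls and treats the first argument of post as a scalar, as in the main program.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Dir { In, Out, InOut };
enum class Dep { RaW, WaR, WaW };

struct Param { std::string file; Dir dir; };
struct Task  { std::string name; std::vector<Param> params; };
struct Edge  { std::size_t from, to; Dep kind; };  // indices into the task list

// Walks the tasks in program order and returns the dependence edges.
std::vector<Edge> analyse(const std::vector<Task> &tasks)
{
    struct FileState {
        bool has_writer = false;
        std::size_t writer = 0;
        std::vector<std::size_t> readers;  // readers since the last write
    };
    std::map<std::string, FileState> files;
    std::vector<Edge> edges;

    for (std::size_t t = 0; t < tasks.size(); ++t) {
        for (const Param &p : tasks[t].params) {
            FileState &f = files[p.file];
            const bool reads  = p.dir != Dir::Out;
            const bool writes = p.dir != Dir::In;
            if (reads && f.has_writer && f.writer != t)
                edges.push_back({f.writer, t, Dep::RaW});
            if (writes) {
                // For inout, the RaW edge already orders the two writers.
                if (!reads && f.has_writer && f.writer != t)
                    edges.push_back({f.writer, t, Dep::WaW});
                for (std::size_t r : f.readers)
                    if (r != t) edges.push_back({r, t, Dep::WaR});
                f.has_writer = true;
                f.writer = t;
                f.readers.clear();
            } else {
                f.readers.push_back(t);
            }
        }
    }
    return edges;
}

// Task list for the example loop (Display omitted, scalars ignored).
std::vector<Task> loop_tasks(int iterations)
{
    std::vector<Task> tasks;
    for (int i = 0; i < iterations; ++i) {
        tasks.push_back({"subst",   {{"referenceCFG", Dir::In}, {"newCFG", Dir::Out}}});
        tasks.push_back({"dimemas", {{"newCFG", Dir::In}, {"traceFile", Dir::In},
                                     {"DimemasOUT", Dir::Out}}});
        tasks.push_back({"post",    {{"DimemasOUT", Dir::In}, {"FinalOUT", Dir::InOut}}});
    }
    return tasks;
}

std::size_t count_kind(const std::vector<Edge> &edges, Dep kind)
{
    std::size_t n = 0;
    for (const Edge &e : edges)
        if (e.kind == kind) ++n;
    return n;
}
```

For two iterations, `analyse(loop_tasks(2))` yields 5 RaW, 2 WaR and 2 WaW edges. The WaR and WaW edges all come from reusing the names newCFG and DimemasOUT across iterations, so giving each iteration's output a fresh name would leave only the 5 RaW edges, which is the effect the renaming bullet refers to.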
Run-time features
• Resource brokering and task scheduling
– Scheduling policy exploits file locality
– File transfer time vs execution time tradeoff considered
– Tasks are submitted for execution as soon as their data dependencies are resolved, if resources are available
– End of tasks is detected by means of asynchronous callbacks
– Calls to Globus:
• globus_gram_client_job_request
• globus_gram_client_job_status
• globus_gram_client_job_cancel
• globus_gram_client_callback_allow
• globus_poll_blocking
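The tradeoff between file transfer time and execution time can be sketched as a simple cost estimate. This is an illustrative model only, not the policy implemented by the run-time: the structures, the linear cost formula and all names (`Resource`, `InputFile`, `estimated_time`, `pick_resource`) are assumptions made for this example. A resource that already holds an input file pays no transfer cost for it, which is how file locality enters the decision.

```cpp
#include <cstddef>
#include <limits>
#include <set>
#include <string>
#include <vector>

struct InputFile { std::string name; double size_mb; };

struct Resource {
    std::string host;
    double speed;                        // relative speed, 1.0 = reference machine
    double bandwidth_mb_s;               // assumed transfer rate to this host
    std::set<std::string> local_files;   // files already present on the host
};

// Estimated completion time: transfer of missing inputs plus execution.
double estimated_time(const Resource &r,
                      const std::vector<InputFile> &inputs,
                      double ref_exec_seconds)
{
    double transfer = 0.0;
    for (const InputFile &f : inputs)
        if (r.local_files.count(f.name) == 0)   // not local: must be copied
            transfer += f.size_mb / r.bandwidth_mb_s;
    return transfer + ref_exec_seconds / r.speed;
}

// Index of the resource with the smallest estimate (0 if the list is empty).
std::size_t pick_resource(const std::vector<Resource> &resources,
                          const std::vector<InputFile> &inputs,
                          double ref_exec_seconds)
{
    std::size_t best = 0;
    double best_time = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < resources.size(); ++i) {
        const double t = estimated_time(resources[i], inputs, ref_exec_seconds);
        if (t < best_time) { best_time = t; best = i; }
    }
    return best;
}
```

With a 1000 MB input held only by a slower host, the slower host wins because the faster one would spend longer copying the file than it saves in execution; with a 10 MB input the faster host wins.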
Run-time features
• Communication between workers and master
– Socket and file mechanisms provided
• Checkpointing at task level
– Inter-task checkpointing
– Transparent to application developer
• All based on Globus Toolkit C APIs (version 2.x)
– Provides authentication and authorization
– File transfers through gsiftp service
– Task handling with gram service
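Task-level (inter-task) checkpointing can be pictured as a log of finished tasks that is consulted on restart. The sketch below is one possible shape of that idea under simple assumptions (a plain text log file, integer task ids, a single master process); the class `TaskCheckpoint` and its log format are hypothetical and do not describe the actual run-time.

```cpp
#include <fstream>
#include <functional>
#include <set>
#include <string>

// Hypothetical illustration of checkpointing between tasks.
class TaskCheckpoint {
public:
    // Loads the ids of tasks completed by an earlier run, if any.
    explicit TaskCheckpoint(const std::string &log_path) : path_(log_path)
    {
        std::ifstream in(path_);
        long id;
        while (in >> id) done_.insert(id);
    }

    // Runs the task unless it is already recorded as completed.
    // Returns true if the task was executed by this call.
    bool run(long task_id, const std::function<void()> &task)
    {
        if (done_.count(task_id) != 0) return false;   // skip on restart
        task();
        std::ofstream out(path_, std::ios::app);       // record completion
        out << task_id << '\n';
        done_.insert(task_id);
        return true;
    }

private:
    std::string path_;
    std::set<long> done_;
};
```

Because the log is only written after a task returns, a task interrupted midway is simply executed again on restart, and the application code calling `run` does not need to know whether it is a first run or a recovery.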
Programming experiences
• Parameter studies (Dimemas, Paramedir)
– Algorithm flexibility
• NAS Grid Benchmarks
– Improved flexibility of the component programs
– Fewer Grid-level source code lines
• Bioinformatics application (production)
– Improved portability (Globus vs just LoadLeveler)
– Fewer Grid-level source code lines
• Pblade solution for bioinformatics
Ongoing work
• fastDNAml
– Computes the likelihood of various phylogenetic trees, starting with aligned DNA sequences from a number of species (Indiana University code)
– Sequential and MPI (grid-enabled) versions available
– Porting to GRID superscalar
• Lower pressure on communications than MPI
• Simpler code than MPI
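Ongoing work
• Run-time: exception handling
char myarray[10];
try {
    for (int n = 0; n <= 10; n++) {
        if (n > 9) throw "Out of range";   // thrown before the out-of-bounds write at n == 10
        myarray[n] = 'z';
    }
}
catch (const char *str) {   // a string literal is thrown as const char *
    cout << "Exception: " << str << endl;
}
• Interesting case: throw in workers, catch in main program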
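The worker-to-master case is listed as ongoing work, so the following is only a sketch of how such propagation might look, not a description of GRID superscalar. It assumes the worker can send a small result record back to the master (for example over the socket or file channel); `TaskResult`, `run_on_worker` and `rethrow_in_master` are hypothetical names. The worker turns an exception into data, and the master turns that data back into an exception the main program can catch.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Record that would travel from worker to master (assumed format).
struct TaskResult {
    bool ok;
    std::string error;   // empty when ok is true
};

// Worker side: run the task and capture any exception as data.
TaskResult run_on_worker(const std::function<void()> &task)
{
    try {
        task();
        return {true, ""};
    } catch (const std::exception &e) {
        return {false, e.what()};
    } catch (const char *msg) {      // e.g. throw "Out of range";
        return {false, msg};
    }
}

// Master side: raise the worker's failure in the main program.
void rethrow_in_master(const TaskResult &r)
{
    if (!r.ok) throw std::runtime_error(r.error);
}
```

In the main program, a `try` block around the point where the task result is consumed would then catch `std::runtime_error` carrying the worker's message. The original exception type is lost in this simple scheme; only the message survives the trip.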
Ongoing work
• OGSA oriented resource broker, based on Globus Toolkit 3.x
• And more future work:
– Bindings to other basic middlewares
• GAT, Ninf-G2
– New language bindings (shell script)
– Enhancements in the run-time performance guided by the performance analysis
Conclusions
• Presentation of the ideas of GRID superscalar
• There is a viable way to ease the programming of Grid applications
• GRID superscalar run-time enables
– Use of the resources in the Grid
– Exploiting the existing parallelism
How GAT can help us
• Middleware at a higher level (skip Globus details)
• Avoid code changes when Globus changes
• Abstraction for using other Grid middlewares
• Resource Broker
• Intra-Task checkpointing mechanism
• Interesting GATObjects:
– GATFile (GATFile_Copy, GATFile_Delete)
– GATResourceDescription, GATResourceBroker, GATJob
More information
• GRID superscalar home page:
http://people.ac.upc.es/rosab/index_gs.htm
• Rosa M. Badia, Jesús Labarta, Raül Sirvent, Josep M. Pérez, José M. Cela, Rogeli Grima, “Programming Grid Applications with GRID Superscalar”, Journal of Grid Computing, Volume 1 (Number 2): 151-170 (2003).