7/27/2019 OPENMP Language Features - Part 1_2
OpenMP Language features
By
Sridhar Ranganathan B.E., M.Tech.,M.Phil., PGSEM
Features provided by OpenMP
OpenMP provides
Directives
Library functions
Environment variables
to create and control the execution of parallel programs
Constructs - I
Parallel construct
Work sharing constructs
Loop construct
Sections construct
Single construct
Data sharing
Nowait
Schedule
Constructs - II
These constructs enable the programmer to orchestrate the actions of different threads
Barrier construct
Critical construct
Atomic construct
Locks
Master construct
Terminology
OpenMP directive: in C/C++, a #pragma that specifies OpenMP program behaviour
Executable directive: an OpenMP directive that is NOT declarative; that is, it may be placed in an executable context
Construct: an OpenMP executable directive and the associated statement, loop or structured block (the lexical extent of an executable directive)
Requirements for OpenMP
OpenMP requires well structured programs
Constructs are associated with statements, loops or structured blocks
In C/C++, a structured block is an executable statement, possibly a compound statement, with a single entry at the top and a single exit at the bottom
Point of entry cannot be a labelled statement
Point of exit cannot be a branch of any type
Parallel construct
This is specified as
#pragma omp parallel clause1 clause2
structured block
Use of parallel construct
This construct specifies the computations that will be executed in parallel
Parts of the program that are NOT enclosed by the construct are executed serially
When a thread encounters this construct, a team of threads is created to execute the associated parallel region
This construct does NOT distribute the work
To distribute the work we need worksharing constructs
Without them, the same work is done by all the threads
At the end of the parallel region there is an implied barrier that makes all the threads wait until the work inside the region is completed
Only the initial (master) thread continues execution after the end of the parallel region
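A minimal sketch of the point above, in C: every thread in the team runs the same block, so a shared counter ends up equal to the team size. The function name count_region_executions is hypothetical, and the atomic construct is used here only to keep the shared update safe. In a build without OpenMP support the pragmas are ignored and the block simply runs once.

```c
/* Hypothetical helper: counts how many threads executed the parallel block. */
int count_region_executions(void)
{
    int executions = 0;              /* shared by all threads in the team */

    #pragma omp parallel
    {
        /* No worksharing construct here, so every thread does this work. */
        #pragma omp atomic
        executions++;
    }
    /* Implied barrier: all threads have finished before we get here. */
    return executions;
}
```

With OpenMP enabled (for example gcc -fopenmp) the return value is the number of threads in the team; in a serial build it is 1.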
Explanation of the parallel construct
contd..
The thread that encounters the parallel construct becomes the master thread of the team
Each thread is assigned a unique thread id
Ids range from zero for the master thread to one less than the number of threads in the team
Threads can follow different paths of execution, for example by testing the thread id in an if statement
The thread id is returned by the omp_get_thread_num() function
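The id numbering described above can be checked with a short sketch. The name ids_are_dense and the MAX_THREADS bound are assumptions made for illustration, and the two stub functions exist only so the file still builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: used only when OpenMP is disabled). */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_THREADS 256   /* illustrative upper bound on the team size */

/* Returns 1 if the ids 0 .. team_size-1 were all observed, else 0. */
int ids_are_dense(void)
{
    int seen[MAX_THREADS] = {0};
    int team_size = 0;

    #pragma omp parallel shared(seen, team_size)
    {
        int id = omp_get_thread_num();       /* unique within the team */

        if (id == 0) {
            /* Different path for the master thread, chosen by its id. */
            team_size = omp_get_num_threads();
        }
        if (id < MAX_THREADS) {
            seen[id] = 1;                    /* each thread writes its own slot */
        }
    }

    for (int i = 0; i < team_size && i < MAX_THREADS; i++) {
        if (!seen[i]) {
            return 0;
        }
    }
    return 1;
}
```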
Clauses supported by parallel construct
if (scalar-expression)
num_threads(scalar_expression)
private(list)
firstprivate(list)
shared(list)
default(none|shared)
copyin(list)
reduction(operator:list)
Restrictions on the parallel construct
A program should NOT branch into or out of the parallel region. If it does, the behaviour is undefined.
A program should NOT depend on any ordering of the evaluations of the clauses, or on any side effects of those evaluations
At most one if clause can appear on the directive
At most one num_threads clause can appear on the directive. The expression of the num_threads clause must evaluate to a positive integer
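As an illustration of the if and num_threads clauses, the sketch below requests up to four threads, but only when the problem size is large enough to be worth it. The function name team_size_for and the threshold of 1000 are hypothetical; the stub is a fallback for builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption: used only when OpenMP is disabled). */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Returns the size of the team that was actually created for size n. */
int team_size_for(int n)
{
    int size = 0;

    /* if(...) false: the region runs on a team of one thread.
       num_threads(4): request (at most) four threads otherwise. */
    #pragma omp parallel if(n >= 1000) num_threads(4) shared(size)
    {
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}
```

The runtime may provide fewer threads than requested, so the result for a large n should be read as "between 1 and 4".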
Sharing the work among threads
A worksharing construct specifies a region of code whose work is to be distributed among the threads of a team, and also specifies the manner in which that work is parceled out.
There are three constructs
#pragma omp for
#pragma omp sections
#pragma omp single
Rules for work sharing constructs
Each worksharing region must be encountered by all threads in a team or by none at all
The sequence of worksharing regions and barrier regions encountered must be the same for every thread in the team
A worksharing construct does NOT launch any new threads
It does NOT have any barrier on entry
It has an implicit barrier at the end, unless a nowait clause is given
Loop construct
#pragma omp for
for (init-expr; var relop b; incr-expr)
init-expr must initialize the loop variable var with an integer expression
relop is one of <, <=, > or >=
b is also an integer expression
incr-expr must increment or decrement var by an integer amount
Using ++, +=, -- or -=
Alternatively it could be var = var + expr
Restrictions for the loop construct
Use of this construct is limited to loops where the number of iterations can be counted
Example: for loops where an integer variable is used as a counter whose value is incremented (or decremented) by a fixed amount on each iteration until an upper or lower bound is reached
This means the number of iterations must be computable before the loop starts, so that the load can be distributed
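A small sketch of a loop in this countable form, written in C. The function name sum_first_n is hypothetical; the reduction clause (one of the clauses listed for the parallel construct) is used here so that each thread accumulates privately and the partial sums are combined at the end.

```c
/* Hypothetical helper: returns 1 + 2 + ... + n using a work-shared loop. */
long sum_first_n(int n)
{
    long total = 0;

    #pragma omp parallel
    {
        /* Countable loop: integer counter, fixed bound, increment of one. */
        #pragma omp for reduction(+:total)
        for (int i = 1; i <= n; i++) {
            total += i;
        }
    }
    return total;
}
```

For example, sum_first_n(100) returns 5050 regardless of how many threads share the iterations.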
section/sections construct
Using the sections construct, we can assign different threads to carry out different kinds of work
It lets us specify different code regions, each of which will be executed by one of the threads
There are two directives
#pragma omp section
#pragma omp sections
Example of section/sections code
#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        structured block
        #pragma omp section
        structured block
    }
}
Explanation of the section/sections construct
#pragma omp sections indicates the start of the construct
#pragma omp section marks each independent section
At run time, the specified code blocks are executed by the threads in the team
Each thread executes one code block at a time
Each code block is executed exactly once
Explanation of the section/sections
contd..
If there are fewer threads than code blocks, some or all threads may execute multiple code blocks
If there are fewer code blocks than threads, some threads will be idle
The most common use of the section/sections construct is to execute functions in parallel
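The "functions in parallel" use can be sketched as follows. The helper names find_min, find_max and min_and_max are hypothetical; each section calls one function, and the two results are written to different shared variables so no synchronization between the sections is needed.

```c
/* Hypothetical helpers; both assume n >= 1. */
static int find_min(const int *data, int n)
{
    int m = data[0];
    for (int i = 1; i < n; i++) {
        if (data[i] < m) m = data[i];
    }
    return m;
}

static int find_max(const int *data, int n)
{
    int m = data[0];
    for (int i = 1; i < n; i++) {
        if (data[i] > m) m = data[i];
    }
    return m;
}

/* Runs the two searches as independent sections. */
void min_and_max(const int *data, int n, int *min_out, int *max_out)
{
    int lo = 0, hi = 0;

    #pragma omp parallel shared(lo, hi)
    {
        #pragma omp sections
        {
            #pragma omp section
            lo = find_min(data, n);   /* executed once, by one thread */

            #pragma omp section
            hi = find_max(data, n);   /* executed once, possibly by another */
        }
    }
    *min_out = lo;
    *max_out = hi;
}
```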
Single construct
The single construct specifies that exactly one thread must execute the associated block
We do not care which thread executes it
The thread executing it can differ from run to run
A typical use is the initialization of shared variables
Example use of single construct
#pragma omp single clause1 clause2
structured block
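A hedged sketch of the initialization use of single: one thread sets a shared value, and the implicit barrier at the end of the single construct ensures every thread sees it before the loop starts. The name scaled_sum and the value 3 are arbitrary choices for illustration.

```c
/* Hypothetical helper: returns scale * (0 + 1 + ... + (n-1)) with scale = 3. */
long scaled_sum(int n)
{
    int scale = 0;
    long total = 0;

    #pragma omp parallel shared(scale, total)
    {
        /* Exactly one thread (we do not care which) initializes scale. */
        #pragma omp single
        scale = 3;
        /* Implicit barrier here: scale is set before any thread continues. */

        #pragma omp for reduction(+:total)
        for (int i = 0; i < n; i++) {
            total += (long)scale * i;
        }
    }
    return total;
}
```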
Master construct
This is similar to the single construct
It guarantees that the work will be done by the master thread
The master construct does NOT have an implied barrier at entry or exit
This may create problems: other threads can read data before the master thread has written it
The solution is to add an explicit barrier
Example use of Master construct
#pragma omp master
structured block
#pragma omp barrier
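The pattern above can be sketched in C as follows. The function name and the value 42 are hypothetical; the sketch counts how many threads read the value the master wrote, which is safe only because of the explicit barrier.

```c
/* Returns 1 if every thread observed the value written by the master. */
int all_threads_see_master_value(void)
{
    int config = 0;    /* written by the master thread only */
    int readers = 0;   /* number of threads that read config */
    int correct = 0;   /* number of threads that read the expected value */

    #pragma omp parallel shared(config, readers, correct)
    {
        #pragma omp master
        config = 42;

        /* master has no implied barrier, so wait for the write explicitly. */
        #pragma omp barrier

        int ok = (config == 42);

        #pragma omp atomic
        readers++;

        #pragma omp atomic
        correct += ok;
    }
    return readers == correct;
}
```

Removing the barrier would make the result depend on timing, since a thread could read config before the master has stored 42.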
Clauses to control parallel and worksharing constructs
shared
private
lastprivate
firstprivate
default
nowait
schedule
Shared clause
The shared clause specifies which data will be shared among the threads executing the region it is associated with
There will be a unique instance of the variable
Each thread can freely read and modify the value
A note of caution: multiple threads may try to update the same variable simultaneously
Synchronization constructs are available to resolve this issue
A good use is when the threads only read the variable
Private clause
The private clause ensures that each thread is given a private copy of the variable
Each variable in the private list is replicated so that each thread gets its own copy
The private copies are not initialized on entry to the region
Firstprivate clause
This is used when private variables need to be initialized before the region in which they are used
Variables in the firstprivate list are private, but each copy is initialized to the value that the variable of the same name has just before entry into the construct
Lastprivate clause
If the value of a private variable is needed after the construct ends, this clause is used
In the case of a work-shared loop, the object will have the value from the iteration of the loop that would be last in a sequential execution
In the case of a sections construct, the object is assigned the value that it has at the end of the lexically last section
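A short sketch combining firstprivate and lastprivate on a work-shared loop. The function name last_offset_value is hypothetical: offset enters each thread's private copy already initialized, and last leaves the loop with the value from the sequentially final iteration.

```c
/* Hypothetical helper; assumes n >= 1 so that a last iteration exists. */
int last_offset_value(int n, int offset)
{
    int last = -1;

    /* firstprivate: each thread's copy of offset starts with the caller's value.
       lastprivate: after the loop, last holds the value from iteration n-1. */
    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = i + offset;
    }
    return last;
}
```

For instance, last_offset_value(10, 5) returns 14, the value computed by iteration i = 9, whichever thread happened to run it.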
Default clause
The default clause gives variables a default data-sharing attribute
In C/C++, the argument is none or shared
If default(shared) is given, all variables not listed as private are shared
If default(none) is given, the programmer is forced to think about each variable and to list it explicitly in a data-sharing clause such as private or shared
default(none) is recommended
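To show what default(none) asks of the programmer, here is a minimal dot-product sketch. The function name dot is hypothetical; every variable used inside the construct has to appear in a clause (the loop counter declared in the for statement is private automatically).

```c
/* Hypothetical helper: dot product of two vectors of length n. */
double dot(const double *a, const double *b, int n)
{
    double result = 0.0;

    /* With default(none), omitting a, b or n from the clauses is a
       compile-time error, which catches accidental sharing early. */
    #pragma omp parallel for default(none) shared(a, b, n) reduction(+:result)
    for (int i = 0; i < n; i++) {
        result += a[i] * b[i];
    }
    return result;
}
```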
Nowait clause
The nowait clause allows the programmer to fine-tune a program's performance
When we add this clause to a worksharing construct, the implicit barrier at its end is suppressed
Usage: once a parallel program runs correctly, we identify places where a barrier is not necessary and introduce this clause
A thread that finishes its share of the work in the parallel loop then continues without waiting for the others to complete theirs
Example: #pragma omp for nowait
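A sketch of a case where the barrier is not needed: two loops that write to different arrays and do not read each other's results. The function name fill_two is hypothetical. A thread that finishes its part of the first loop can start on the second immediately.

```c
/* Hypothetical helper: a[i] = i and b[i] = 2 * i for 0 <= i < n. */
void fill_two(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        /* No barrier after this loop: the next loop does not depend on a. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }

        /* The barrier at the end of this loop (and of the parallel region)
           ensures both arrays are complete when the function returns. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            b[i] = 2 * i;
        }
    }
}
```

If the second loop read from a, the nowait would be a bug, which is why the slide suggests adding it only after the program already runs correctly.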
Schedule clause
This is supported on the loop construct only, as follows
#pragma omp for schedule(kind, chunk_size)
There are four kinds of scheduling
static
dynamic
guided
runtime
Schedule clause contd..
Schedule kind    Description
static    Iterations are divided into chunks of size chunk_size; the chunks are assigned to the threads statically in round-robin fashion; the last chunk to be assigned may have a smaller number of iterations. With no chunk size specified, the iteration space is divided into chunks that are approximately equal in size, and each thread is assigned at most one chunk.
dynamic    Iterations are assigned to the threads as the threads request them; a thread executes a chunk of iterations (its size controlled through the chunk_size parameter), then requests another chunk until there are no more chunks to work on; the last chunk may have fewer iterations. When no chunk size is specified, it defaults to 1.
Schedule clause contd..
Schedule kind    Description
guided    Iterations are assigned to the threads as the threads request them; a thread executes a chunk, as controlled through the chunk_size parameter, and then requests another chunk, until there are no more chunks. For a chunk_size of 1, the size of each chunk is proportional to the number of unassigned iterations divided by the number of threads, decreasing to 1. For a chunk_size of k, the size of each chunk is determined in the same way, with the restriction that the chunks do not contain fewer than k iterations (except the last chunk). When no chunk size is specified, it defaults to 1.
runtime    The decision is made at run time; the schedule and chunk size are set through the environment variable OMP_SCHEDULE
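As a sketch of when a non-static schedule helps, the loop below does more work in later iterations, so handing out small chunks on request balances the load better than one equal block per thread. The function name triangular_work and the chunk size of 4 are illustrative choices, not recommendations.

```c
/* Hypothetical helper: iteration i performs i units of work.
   Returns the total number of units, n * (n - 1) / 2. */
long triangular_work(int n)
{
    long total = 0;

    /* dynamic, 4: threads request four iterations at a time as they
       become free, instead of receiving a fixed share up front. */
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            total += 1;
        }
    }
    return total;
}
```

The schedule affects only how iterations are assigned to threads; the result is the same for every schedule kind.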
OpenMP synchronization constructs
Barrier construct
#pragma omp barrier
Ordered construct
#pragma omp ordered
Allows a structured block within a parallel loop to be executed in sequential order
Critical construct
Provides a means to ensure that multiple threads do not attempt to update the same shared data simultaneously; the protected code is called a critical region
An optional name may be given; names are global, so critical regions with the same name exclude one another throughout the program
When a thread reaches a critical region, it waits until no other thread is executing a critical region with the same name
#pragma omp critical (name)
structured block
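A hedged sketch of the critical construct: each thread finds a local maximum without synchronization, then merges it into the shared result inside a named critical region. The function name parallel_max and the region name max_update are hypothetical.

```c
/* Hypothetical helper; assumes n >= 1. */
int parallel_max(const int *data, int n)
{
    int global_max = data[0];

    #pragma omp parallel shared(global_max)
    {
        int local_max = data[0];          /* private to each thread */

        #pragma omp for nowait
        for (int i = 1; i < n; i++) {
            if (data[i] > local_max) {
                local_max = data[i];
            }
        }

        /* Only one thread at a time may compare and update global_max. */
        #pragma omp critical (max_update)
        {
            if (local_max > global_max) {
                global_max = local_max;
            }
        }
    }
    return global_max;
}
```

Keeping the loop outside the critical region means threads serialize only once each, for the final merge.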
OpenMP Locks
These act as semaphores
OpenMP provides a set of low-level, general-purpose locking routines
They provide greater flexibility for synchronization
Nested locks are also possible
Definition
omp_lock_t var1;
Routines for simple locks (each takes a pointer to the lock)
Initialization omp_init_lock(&var1);
Set lock omp_set_lock(&var1);
Test lock omp_test_lock(&var1);
Unset lock omp_unset_lock(&var1);
Destroy lock omp_destroy_lock(&var1);
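The life cycle of a simple lock can be sketched as below. The function name locked_counter is hypothetical, and the stub definitions are an assumption that only applies to builds without OpenMP, where a single thread runs and no real locking is needed.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: used only when OpenMP is disabled). */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock)    { *lock = 0; }
static void omp_set_lock(omp_lock_t *lock)     { *lock = 1; }
static void omp_unset_lock(omp_lock_t *lock)   { *lock = 0; }
static void omp_destroy_lock(omp_lock_t *lock) { (void)lock; }
#endif

/* Increments a shared counter `increments` times under a simple lock. */
int locked_counter(int increments)
{
    omp_lock_t lock;
    int counter = 0;

    omp_init_lock(&lock);                 /* must precede any use */

    #pragma omp parallel for shared(counter, lock)
    for (int i = 0; i < increments; i++) {
        omp_set_lock(&lock);              /* blocks until the lock is free */
        counter++;
        omp_unset_lock(&lock);            /* released by the owning thread */
    }

    omp_destroy_lock(&lock);              /* lock must be unset here */
    return counter;
}
```

For a plain counter a reduction or atomic update would be cheaper; the lock is used here only to show the call sequence.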
Interaction with environment
OpenMP defines internal control variables (ICVs)
They govern the behaviour of the program at run time
They cannot be accessed or modified directly at the application level
They can be set through OpenMP environment variables, and queried or modified through OpenMP library routines
The internal control variables are
nthreads-var
dyn-var
nest-var
run-sched-var
def-sched-var
Environment variables
OMP_NUM_THREADS    omp_set_num_threads(), omp_get_num_threads()
OMP_DYNAMIC (boolean)    omp_set_dynamic(), omp_get_dynamic()
OMP_NESTED    omp_set_nested(), omp_get_nested()
OMP_SCHEDULE
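The routines listed above can be used from code as in this sketch, which turns off dynamic adjustment and then requests a team size. The function name request_team is hypothetical, and the stubs are an assumption that applies only to builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: used only when OpenMP is disabled). */
static void omp_set_dynamic(int flag)   { (void)flag; }
static void omp_set_num_threads(int n)  { (void)n; }
static int  omp_get_num_threads(void)   { return 1; }
#endif

/* Requests a team of `requested` threads; returns the size actually used. */
int request_team(int requested)
{
    int size = 0;

    omp_set_dynamic(0);               /* ask the runtime not to adjust the size */
    omp_set_num_threads(requested);   /* overrides OMP_NUM_THREADS from here on */

    #pragma omp parallel shared(size)
    {
        #pragma omp single
        size = omp_get_num_threads(); /* team size inside the region */
    }
    return size;
}
```

The same request could be made without recompiling by setting OMP_NUM_THREADS in the environment before the program starts; the runtime may still provide fewer threads than requested.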