OPENMP Language Features - Part 1_2

  • 7/27/2019 OPENMP Language Features - Part 1_2

    OpenMP Language features

    By

    Sridhar Ranganathan B.E., M.Tech., M.Phil., PGSEM

    Features provided by OpenMP

    OpenMP provides three kinds of features to create and control the execution of parallel programs:

        Directives

        Library functions

        Environment variables

    Constructs - I

    Parallel construct

    Worksharing constructs

        Loop construct

        Sections construct

        Single construct

    Data-sharing clauses

    nowait clause

    schedule clause

    Constructs - II

    These constructs enable the programmer to orchestrate the actions of different threads:

    Barrier construct

    Critical construct

    Atomic construct

    Locks

    Master construct

    Terminology

    OpenMP directive: in C/C++, a #pragma that specifies OpenMP program behaviour.

    Executable directive: an OpenMP directive that is NOT declarative; that is, it may be placed in an executable context.

    Construct: an OpenMP executable directive and the associated statement, loop, or structured block (the lexical extent of an executable directive).

    Requirements for OpenMP

    OpenMP requires well-structured programs.

    Constructs are associated with statements, loops, or structured blocks.

    In C/C++, a structured block is defined to be an executable statement, possibly a compound statement, with a single entry at the top and a single exit at the bottom.

    The point of entry cannot be a labelled statement.

    The point of exit cannot be a branch of any type.

    Parallel construct

    The parallel construct is specified as

    #pragma omp parallel [clause1 clause2 ...]

        structured block

    Use of parallel construct

    This construct is used to specify the computations that are to be executed in parallel.

    Parts of the program that are NOT enclosed by the construct are executed serially.

    When a thread encounters this construct, a team of threads is created to execute the associated parallel region.

    This construct does NOT distribute the work.

    To distribute the work, worksharing constructs are needed inside the region.

    Without a worksharing construct, the same work is done by all the threads.

    At the end of the parallel region there is an implied barrier, which makes all the threads wait until the work inside the region is completed.

    Only the thread that encountered the construct continues execution after the end of the parallel region.

    Explanation of the parallel construct (contd.)

    The thread that encounters the parallel construct becomes the master thread of the team.

    Each thread is assigned a unique thread id.

    Thread ids range from zero for the master thread to one less than the number of threads in the team.

    Each thread may follow a different path of execution, for example by testing its thread id in an if statement.

    The thread id can be obtained with the omp_get_thread_num() function.
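
The points above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the function name count_team_members, the MAX_THREADS bound, and the request for 4 threads are assumptions made for the example, and the serial fallbacks exist only so that the sketch also builds when OpenMP support is switched off.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption of this sketch) for builds without OpenMP. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_THREADS 64   /* hypothetical upper bound used only by this sketch */

/* Runs one parallel region; returns the team size if every thread
   reported its id, or -1 otherwise. */
int count_team_members(void)
{
    int seen[MAX_THREADS] = {0};
    int team_size = 1;

    #pragma omp parallel num_threads(4)
    {
        int id = omp_get_thread_num();   /* declared inside the region, so private */

        if (id == 0)                     /* only the master thread takes this path */
            team_size = omp_get_num_threads();
        if (id < MAX_THREADS)
            seen[id] = 1;                /* each thread writes its own slot */
        printf("hello from thread %d\n", id);
    }   /* implied barrier: every thread has finished here */

    int reported = 0;
    for (int i = 0; i < team_size && i < MAX_THREADS; i++)
        reported += seen[i];
    return reported == team_size ? team_size : -1;
}
```

Every thread executes the whole block, so the greeting is printed once per thread, in no particular order. The runtime may provide fewer threads than requested.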

    Clauses supported by the parallel construct

    if(scalar-expression)

    num_threads(integer-expression)

    private(list)

    firstprivate(list)

    shared(list)

    default(none|shared)

    copyin(list)

    reduction(operator: list)

    Restrictions on the parallel construct

    A program should NOT branch into or out of the parallel region. If it does, the behaviour is undefined.

    A program should NOT depend on any ordering of the evaluations of the clauses, or on any side effects of those evaluations.

    At most one if clause can appear on the directive.

    At most one num_threads clause can appear on the directive. The expression of the num_threads clause must evaluate to a positive integer.

    Sharing the work among threads

    A worksharing construct specifies a region of code whose work is to be distributed among the threads of a team, and also specifies the manner in which the work in the region is to be parceled out.

    There are three worksharing constructs:

        #pragma omp for

        #pragma omp sections

        #pragma omp single

    Rules for work sharing constructs

    Each worksharing region must be encountered by all threads in a team or by none at all.

    The sequence of worksharing regions and barrier regions encountered must be the same for every thread in the team.

    A worksharing construct does NOT launch any new threads.

    It does NOT have a barrier on entry.

    It has an implicit barrier at the end, unless a nowait clause is specified.

    Loop construct

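    #pragma omp for [clause1 clause2 ...]

        for (init-expr; var relop b; incr-expr)

    init-expr initializes the integer loop variable var with an integer expression.

    relop is one of <, <=, >, >=.

    b is also an integer expression.

    incr-expr must change var by an integer amount using ++, +=, --, or -=. Alternatively it can be written as var = var + expr or var = var - expr.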

    Restrictions for the loop construct

    Use of the loop construct is limited to loops whose number of iterations can be counted.

    An example is a for loop in which an integer variable is used as a counter whose value is incremented or decremented by a fixed amount on each iteration until an upper or lower bound is reached.

    This means the compiler must be able to count the number of iterations in order to distribute the load.
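
A countable loop of this kind can be sketched as follows. The function names saxpy and sum_to are illustrative assumptions. The second function also uses the reduction clause, which appears here on the loop construct so that each thread accumulates a private partial sum.

```c
/* Computes y[i] = a * x[i] + y[i]; the iterations are shared among the threads. */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma omp parallel
    {
        /* Canonical form: integer counter, fixed bound, increment by one. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }
}

/* Sums the integers 1..n. The reduction clause gives each thread a private
   partial sum and combines the partial sums at the end of the loop. */
long sum_to(int n)
{
    long total = 0;

    #pragma omp parallel
    {
        #pragma omp for reduction(+:total)
        for (int i = 1; i <= n; i++)
            total += i;
    }
    return total;
}
```

Because the trip count is known before the loop starts, the runtime can hand each thread its share of the iterations. A loop such as while (node != NULL) could not be distributed this way.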

    section/sections construct

    Using the sections construct, we can assign different threads to carry out different kinds of work.

    Using the sections construct, we can specify different code regions, each of which will be executed by one of the threads.

    There are two directives:

        #pragma omp sections

        #pragma omp section

    Example of section/sections code

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
                structured block

            #pragma omp section
                structured block
        }
    }

    Explanation of the section/sections construct

    #pragma omp sections indicates the start of the construct.

    #pragma omp section marks each independent section.

    At run time, the specified code blocks are executed by the threads in the team.

    Each thread executes one code block at a time.

    Each code block is executed exactly once.

    Explanation of the section/sections construct (contd.)

    If there are fewer threads than code blocks, some or all threads may execute multiple code blocks.

    If there are fewer code blocks than threads, some threads will be idle.

    The most common use of the section/sections construct is to execute functions in parallel.
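
A sketch of that common use, running two independent functions in parallel, is shown below. The helper functions sum_of_squares and sum_of_cubes and the wrapper run_sections are hypothetical names chosen for the example.

```c
/* Two independent pieces of work (hypothetical helpers). */
static long sum_of_squares(int n)
{
    long s = 0;
    for (int i = 1; i <= n; i++)
        s += (long)i * i;
    return s;
}

static long sum_of_cubes(int n)
{
    long s = 0;
    for (int i = 1; i <= n; i++)
        s += (long)i * i * i;
    return s;
}

/* Runs the two helpers in separate sections and returns the combined result. */
long run_sections(int n)
{
    long squares = 0, cubes = 0;

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            squares = sum_of_squares(n);   /* executed once, by one thread */

            #pragma omp section
            cubes = sum_of_cubes(n);       /* may run at the same time on another thread */
        }   /* implicit barrier: both results are available past this point */
    }
    return squares + cubes;
}
```

Each section writes a different shared variable, so no synchronization is needed between them.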

    Single construct

    The single construct is used to specify that exactly one thread must execute the specified part.

    We do not care which thread actually executes it.

    The thread executing it can differ from run to run.

    A typical use is the initialization of shared variables.

    Example use of single construct

    #pragma omp single [clause1 clause2 ...]

    structured block
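
As a concrete illustration of using single for initialization, the sketch below lets one thread set a shared bound before the whole team fills a table. The function name single_demo and the table size of 8 are assumptions made for the example.

```c
/* One thread sets the shared bound; the whole team then fills the table.
   Returns the sum of the table entries. */
int single_demo(void)
{
    int table[8];
    int limit = 0;

    #pragma omp parallel
    {
        #pragma omp single
        limit = 8;     /* executed by exactly one, unspecified thread */
        /* implicit barrier at the end of single: every thread now sees limit == 8 */

        #pragma omp for
        for (int i = 0; i < limit; i++)
            table[i] = i;
    }

    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += table[i];
    return sum;
}
```

The barrier at the end of the single construct is what makes it safe for the other threads to read limit in the following loop.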

    Master construct

    This is similar to the single construct.

    It guarantees that the work will be done by the master thread.

    The master construct does NOT have an implied barrier at entry or exit.

    This may create problems when other threads read data that the master thread has not yet written.

    The solution is to add an explicit barrier after the construct.

    Example use of Master construct

    #pragma omp master

    structured block

    #pragma omp barrier
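
The pattern above can be filled in as follows. This is a sketch under assumed names (master_demo, scale, out); it shows why the explicit barrier is needed before the other threads use the value written by the master thread.

```c
/* The master thread sets a shared factor; after the barrier the whole
   team uses it. Returns the last element of the result. */
int master_demo(void)
{
    int scale = 0;
    int out[8];

    #pragma omp parallel
    {
        #pragma omp master
        scale = 3;            /* executed only by the master thread (id 0) */

        #pragma omp barrier   /* without this, other threads could read scale too early */

        #pragma omp for
        for (int i = 0; i < 8; i++)
            out[i] = scale * i;
    }
    return out[7];
}
```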

    Clauses to control parallel and worksharing constructs

    shared

    private

    lastprivate

    firstprivate

    default

    nowait

    schedule

    Shared clause

    The shared clause specifies which data will be shared among the threads executing the region it is associated with.

    There is a single instance of each shared variable.

    Each thread can freely read and modify its value.

    A note of caution: multiple threads may try to update the same variable simultaneously.

    Synchronization constructs are available to resolve this issue.

    Sharing is safest when the threads only read the variable.

    Private clause

    The private clause ensures that each thread is given a private copy of the variable.

    Each variable in the private list is replicated so that each thread gets its own copy. The copies are NOT initialized on entry to the region.
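
The difference between shared and private variables can be sketched as below. The function name scale_all and the scratch variable tmp are assumptions for the example: the arrays are shared and only read or written at distinct indices, while tmp is private so that threads do not overwrite each other's intermediate value.

```c
/* Doubles each element of in[] into out[]. */
void scale_all(int n, const int *in, int *out)
{
    int tmp = 0;

    #pragma omp parallel shared(n, in, out) private(tmp)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            tmp = 2 * in[i];   /* the private copy starts uninitialized: assign before reading */
            out[i] = tmp;      /* distinct index per iteration, so no conflict on out[] */
        }
    }
}
```

If tmp were shared instead, two threads could interleave the assignment and the read, and an element of out could receive another iteration's value.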

    Firstprivate clause

    This clause is used when private variables must be initialized before the region in which they are used.

    Variables listed in firstprivate are private variables, but each copy is initialized to the value that the variable with the same name has just before entry into the region.

    Lastprivate clause

    If the value of a private variable is needed after the construct has finished, the lastprivate clause is used.

    In the case of a work-shared loop, the variable gets the value from the iteration of the loop that would be last in a sequential execution.

    In the case of a sections construct, the variable gets the value that it has at the end of the lexically last section.
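
Both firstprivate and lastprivate can be seen in one small sketch. The names first_last_demo, offset, and last are illustrative assumptions, and the function assumes n is at least 1, since with no iterations the lastprivate variable would not receive a defined value.

```c
/* Fills out[i] = i + offset and returns the value written by the
   sequentially last iteration. Assumes n >= 1. */
int first_last_demo(int n, int *out)
{
    int offset = 10;   /* each thread's copy starts with this value (firstprivate) */
    int last = -1;     /* receives the value from iteration n - 1 (lastprivate) */

    #pragma omp parallel
    {
        #pragma omp for firstprivate(offset) lastprivate(last)
        for (int i = 0; i < n; i++) {
            out[i] = i + offset;
            last = out[i];
        }
    }
    return last;
}
```

With plain private(offset) the copies would be uninitialized, and with plain private(last) the value computed inside the loop would be lost after the construct.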

    Default clause

    The default clause gives variables a default data-sharing attribute.

    In C/C++, the argument is either none or shared.

    If default(shared) is given, all variables that are not listed in another data-sharing clause are shared.

    If default(none) is given, the programmer is forced to think about every variable and to list each one explicitly in a data-sharing clause such as private or shared.

    default(none) is recommended.
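
A sketch of default(none) in use follows. The function name mean is an assumption. Every variable referenced inside the region that is declared outside it has to appear in some clause, otherwise the compiler rejects the code.

```c
/* Returns the arithmetic mean of v[0..n-1], or 0.0 for an empty array. */
double mean(int n, const double *v)
{
    double total = 0.0;

    /* n and v are only read, so they are shared; total is combined by reduction.
       Leaving any of them out of the clauses is a compile-time error. */
    #pragma omp parallel default(none) shared(n, v) reduction(+:total)
    {
        #pragma omp for
        for (int i = 0; i < n; i++)   /* i is declared in the loop, so it is private */
            total += v[i];
    }
    return n > 0 ? total / n : 0.0;
}
```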

    Nowait clause

    The nowait clause allows the programmer to fine-tune a program's performance.

    When this clause is added to a construct, the barrier at the end of the associated construct is suppressed.

    Usage: once a parallel program runs correctly, we identify places where the barrier is not necessary and introduce this clause there.

    When a thread has finished the work associated with the parallel loop, it continues without waiting for the others to complete their work.

    Example: #pragma omp for nowait
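
A situation in which the barrier is safe to remove is sketched below, with an assumed function name two_loops. The two loops write to different arrays and neither reads the other's output, so a thread can start on the second loop as soon as it finishes its share of the first.

```c
/* Fills a[i] = i and b[i] = 2 * i using two work-shared loops. */
void two_loops(int n, int *a, int *b)
{
    #pragma omp parallel
    {
        #pragma omp for nowait   /* no barrier: the next loop does not depend on a[] */
        for (int i = 0; i < n; i++)
            a[i] = i;

        #pragma omp for          /* implicit barrier kept at the end of the last loop */
        for (int i = 0; i < n; i++)
            b[i] = 2 * i;
    }
}
```

If the second loop read a[], the nowait clause would introduce a race, because another thread might not have written its part of a[] yet.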

    Schedule clause

    This clause is supported on the loop construct only, as follows:

    #pragma omp for schedule(kind[, chunk_size])

    There are four kinds of scheduling:

        static

        dynamic

        guided

        runtime

    Schedule clause (contd.)

    Schedule kind    Description

    static           Iterations are divided into chunks of size chunk_size; the chunks are assigned to the threads statically in a round-robin fashion; the last chunk to be assigned may have a smaller number of iterations. With no chunk size specified, the iteration space is divided into chunks that are approximately equal in size, and each thread is assigned at most one chunk.

    dynamic          Iterations are assigned to the threads as the threads request them; a thread executes a chunk of iterations, whose size is controlled through the chunk_size parameter, and then requests another chunk until there are no more chunks to work on; the last chunk may have fewer iterations. When no chunk size is specified, it defaults to 1.

    Schedule clause (contd. further)

    Schedule kind    Description

    guided           Iterations are assigned to the threads as the threads request them; a thread executes a chunk, whose size is controlled through the chunk_size parameter, and then requests another chunk until there are no more chunks. For a chunk_size of 1, the size of each chunk is proportional to the number of unassigned iterations divided by the number of threads, decreasing to 1. For a chunk_size of k, the size of each chunk is determined in the same way, with the restriction that the chunks do not contain fewer than k iterations (except for the last chunk). When no chunk size is specified, it defaults to 1.

    runtime          The decision is made at run time; the schedule and chunk size are set through the environment variable OMP_SCHEDULE.
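
The effect of the schedule kind is easiest to see on a loop whose iterations take unequal time. In the sketch below (function name triangular_work and chunk size 4 are assumptions), iteration i does i + 1 units of work, so equal-sized static chunks would leave the threads holding the early iterations idle. A dynamic schedule lets them pick up further chunks.

```c
/* Counts the unit steps of a triangular loop nest. */
long triangular_work(int n)
{
    long total = 0;

    #pragma omp parallel
    {
        /* Threads request chunks of 4 iterations as they become free. */
        #pragma omp for schedule(dynamic, 4) reduction(+:total)
        for (int i = 0; i < n; i++)
            for (int j = 0; j <= i; j++)
                total += 1;
    }
    return total;   /* n * (n + 1) / 2, whatever the schedule */
}
```

Replacing the clause with schedule(static), schedule(guided), or schedule(runtime) changes only how iterations are assigned to threads, not the result.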

    OpenMP synchronization constructs

    Barrier construct: #pragma omp barrier

    Ordered construct: #pragma omp ordered

        Allows a structured block within a parallel loop to be executed in sequential order.

    Critical construct: #pragma omp critical [(name)]

        structured block

        This provides a means to ensure that multiple threads do not attempt to update the same shared data simultaneously. The protected code is called a critical region.

        An optional name may be given. Names are global identifiers, and critical regions with the same name exclude each other.

        When a thread reaches a critical region, it waits until no other thread is executing a critical region with the same name.
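
A typical use of a critical region is to merge per-thread results into one shared variable. In this sketch the function name find_max and the region name update_best are assumptions, and the function assumes n is at least 1.

```c
/* Returns the largest element of v[0..n-1]. Assumes n >= 1. */
int find_max(int n, const int *v)
{
    int best = v[0];

    #pragma omp parallel
    {
        int local = v[0];                /* private running maximum of this thread */

        #pragma omp for nowait
        for (int i = 1; i < n; i++)
            if (v[i] > local)
                local = v[i];

        #pragma omp critical (update_best)
        {
            if (local > best)            /* only one thread at a time runs this block */
                best = local;
        }
    }
    return best;
}
```

Keeping the comparison inside the loop private and entering the critical region once per thread limits the time threads spend waiting for each other.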

    OpenMP Locks

    These behave like binary semaphores.

    OpenMP provides a set of low-level, general-purpose locking routines.

    They provide greater flexibility for synchronization.

    Nested locks are also possible.

    Definition

        omp_lock_t lck;

        omp_lock_t *var1 = &lck;

    Routines for simple locks

        Initialization    omp_init_lock(var1);

        Set lock          omp_set_lock(var1);

        Test lock         omp_test_lock(var1);

        Unset lock        omp_unset_lock(var1);

        Destroy lock      omp_destroy_lock(var1);
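
The life cycle of a simple lock can be sketched as follows. The function name count_with_lock is an assumption, and the serial stand-ins in the #else branch are not part of OpenMP: they are included only so that the sketch also builds when OpenMP support is switched off.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption of this sketch) for builds without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Increments a shared counter n times, protecting each update with a lock. */
int count_with_lock(int n)
{
    omp_lock_t lock;
    int counter = 0;

    omp_init_lock(&lock);              /* a lock must be initialized before use */

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            omp_set_lock(&lock);       /* blocks until the lock is available */
            counter++;                 /* protected update of shared data */
            omp_unset_lock(&lock);     /* releases the lock for the next thread */
        }
    }

    omp_destroy_lock(&lock);           /* the lock is no longer needed */
    return counter;
}
```

For an update this simple a critical or atomic construct would do the same job. Locks become useful when the set and unset calls cannot sit in one structured block, or when omp_test_lock() is used to do other work instead of waiting.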

    Interaction with environment

    OpenMP defines internal control variables.

    They govern the behaviour of the program at run time.

    They cannot be accessed or modified directly at the application level.

    They can be set through OpenMP environment variables, and queried or modified through OpenMP library routines.

    The internal control variables are:

        nthreads-var

        dyn-var

        nest-var

        run-sched-var

        def-sched-var

    Environment variables

    Environment variable     Related library routines

    OMP_NUM_THREADS          omp_set_num_threads(), omp_get_num_threads()

    OMP_DYNAMIC (boolean)    omp_set_dynamic(), omp_get_dynamic()

    OMP_NESTED               omp_set_nested(), omp_get_nested()

    OMP_SCHEDULE
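
The library routines can override the environment settings from inside the program, as in the sketch below. The function name team_size_for is an assumption, and the serial fallbacks in the #else branch are stand-ins for builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption of this sketch) for builds without OpenMP. */
static void omp_set_dynamic(int flag)    { (void)flag; }
static void omp_set_num_threads(int n)   { (void)n; }
static int  omp_get_num_threads(void)    { return 1; }
#endif

/* Requests a team size at run time and returns the size actually obtained. */
int team_size_for(int requested)
{
    int actual = 0;

    omp_set_dynamic(0);                   /* ask the runtime not to adjust the team size */
    omp_set_num_threads(requested);       /* takes precedence over OMP_NUM_THREADS */

    #pragma omp parallel
    {
        #pragma omp single
        actual = omp_get_num_threads();   /* reports the team size only inside a parallel region */
    }
    return actual;
}
```

The same request can be made without recompiling by setting OMP_NUM_THREADS in the environment before the program starts. The runtime may still provide fewer threads than requested.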