
Bulk-Synchronous-Parallel - BSP


CS430 Presentation 1


Bridging model: Conceptual bridging between hardware and software.

Sequential: Von Neumann model (central processing unit, memory, input/output). Processing goes through one physical location.

Parallel: BSP. Efficient bridging: the model is not tied to hardware or software in particular, but to something in between. As hardware costs have fallen in recent times, parallel computers have become far more common. Processing goes through many physical locations.


Aim of the proposed model: considering a small number of processors; efficient universality results; optimal simulations; avoiding logarithmic losses in efficiency; parallelism for general-purpose computing.

Important aspects: Avoid the onerous burdens of managing memory, assigning communication, and performing low-level synchronization. PRAM is the ideal. Turing theory.


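Programs are assumed to be written for V virtual processors and run on P physical processors, with V larger than P (V = P log P). The excess of virtual over physical processors is the slack.

The high-level language could provide a shared virtual address space.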

This slack is exploited by the compiler to schedule and pipeline computation and communication efficiently (sometimes unavoidable).

To reduce the amount of slack required, improve constant factors in runtime, or avoid hashing, the programmer may choose to take control of these tasks.
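
The relation V = P log P on this slide can be made concrete with a small sketch. It computes the number of virtual processors for a given P and spreads them evenly over the physical ones. The function names, the base-2 logarithm, and the round-robin placement are assumptions made for illustration only; the slides do not fix any of them.

```python
import math


def virtual_processor_count(p: int) -> int:
    """Return V = P * log2(P), rounded up. Base 2 is an assumption."""
    if p < 2:
        raise ValueError("need at least 2 physical processors")
    return math.ceil(p * math.log2(p))


def assign_round_robin(v: int, p: int) -> list:
    """Map virtual processor ids 0..v-1 onto p physical processors.

    Round-robin placement is a hypothetical choice; a real compiler
    could schedule the virtual processors differently.
    """
    return [list(range(phys, v, p)) for phys in range(p)]


if __name__ == "__main__":
    p = 8
    v = virtual_processor_count(p)
    for phys, virtuals in enumerate(assign_round_robin(v, p)):
        print(f"physical {phys}: virtual {virtuals}")
```

With P = 8 each physical processor hosts log2(8) = 3 virtual processors, which is the slack the compiler can use to overlap computation with communication.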


A Bulk-Synchronous Parallel Computer (BSPC) combines three attributes:

A number of components, each performing processing and/or memory functions.

A router that delivers messages point to point between pairs of components.

Facilities for synchronizing all or a subset of the components at regular intervals of L time units, where L is the periodicity parameter.


A computation consists of a sequence of supersteps. In each superstep, each component is allocated a task consisting of some combination of local computation steps, message transmissions, and (implicitly) message arrivals from other components.

After each period of L time units, a global check is made to determine whether the superstep has been completed by all the components. If yes, the machine proceeds to the next superstep. Otherwise, the next period of L units is allocated to the unfinished superstep.
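
The periodic global check can be illustrated with a short calculation. Given how many time units each component needs for its task, the sketch below counts how many periods of length L pass before the check succeeds. The function names and the idea of measuring each task in abstract time units are assumptions for illustration.

```python
import math


def periods_needed(task_times, period_l):
    """Count the periods of length L until every component has finished.

    The check happens only at multiples of L, so the slowest component
    determines the result. At least one period is always used.
    """
    if period_l <= 0:
        raise ValueError("L must be positive")
    slowest = max(task_times)
    return max(1, math.ceil(slowest / period_l))


def superstep_duration(task_times, period_l):
    """Total time occupied by the superstep, a whole multiple of L."""
    return periods_needed(task_times, period_l) * period_l


if __name__ == "__main__":
    times = [3, 7, 12]  # hypothetical per-component task lengths
    print(periods_needed(times, 5))      # 3
    print(superstep_duration(times, 5))  # 15
```

Here the slowest component needs 12 units, so the checks at times 5 and 10 fail and the superstep ends at time 15.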

Router: The router is separate from the components, which means that communication and computation are separated. It is intended for implementing storage accesses between components. It assumes no combining, duplication, or broadcasting facilities.
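
A minimal sequential sketch of a point-to-point router follows. Messages sent during a superstep are held by the router and handed over only when the superstep ends, and each message has exactly one destination (no combining, duplication, or broadcast). The `Router` class, `run_superstep`, and `ring_task` are hypothetical names invented for this example; they are not part of any BSP library.

```python
from collections import defaultdict


class Router:
    """Holds point-to-point messages until the end of the superstep."""

    def __init__(self, n):
        self.n = n
        self._in_flight = []

    def send(self, src, dst, payload):
        # Exactly one destination per message: no broadcast or combining.
        if not 0 <= dst < self.n:
            raise ValueError("unknown destination component")
        self._in_flight.append((src, dst, payload))

    def deliver(self):
        """Return the inboxes for the next superstep and clear the router."""
        inboxes = defaultdict(list)
        for src, dst, payload in self._in_flight:
            inboxes[dst].append((src, payload))
        self._in_flight = []
        return inboxes


def run_superstep(states, inboxes, router, task):
    """Run the task on every component, then deliver all messages."""
    new_states = [
        task(i, states[i], inboxes[i], router) for i in range(len(states))
    ]
    return new_states, router.deliver()


def ring_task(i, state, inbox, router):
    """Add received values to the local state, then pass it to the right."""
    total = state + sum(payload for _, payload in inbox)
    router.send(i, (i + 1) % router.n, total)
    return total


if __name__ == "__main__":
    router = Router(3)
    states, inboxes = [1, 2, 3], defaultdict(list)
    for _ in range(2):
        states, inboxes = run_superstep(states, inboxes, router, ring_task)
    print(states)  # [4, 3, 5]
```

Keeping delivery inside `run_superstep` rather than inside `send` mirrors the separation of communication from computation: a task never observes a message sent in the same superstep.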


Synchronization mechanism: It captures in a simple way the idea of global synchronization at a controllable level of coarseness. Hardware performs this synchronization without overburdening programmers. Synchronization can be switched off for any subset of components, which then run as independent processes. Their communication is not interrupted, and they can still choose an alternative type of synchronization if needed.

The value of the periodicity L may be controlled by the program, even at runtime. Hardware sets a lower bound on L, while software sets an upper bound on L. For efficiency, each independent process is assigned a task of L steps.
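
Why a task of L steps is the efficient choice can be seen from a simple utilization estimate. The sketch assumes one step takes one time unit and that a component sits idle from the end of its task until the next synchronization point; both are simplifying assumptions, and the function name is hypothetical.

```python
import math


def utilization(work_steps, period_l):
    """Fraction of the superstep a component spends doing useful work.

    The superstep lasts a whole number of periods of length L, so any
    time between finishing the task and the next check is idle.
    """
    if work_steps <= 0 or period_l <= 0:
        raise ValueError("work and L must be positive")
    periods = math.ceil(work_steps / period_l)
    return work_steps / (periods * period_l)


if __name__ == "__main__":
    for work in (5, 10, 11, 20):
        print(work, utilization(work, 10))
```

Under these assumptions a task of exactly L steps (or a whole multiple of L) keeps the component fully busy, while a task of L + 1 steps wastes almost a full period.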
