Hierarchically Tiled Arrays Taylor Finnell Alex Schwartz

Page 1: Hierarchically Tiled Arrays

Hierarchically Tiled Arrays
Taylor Finnell, Alex Schwartz

Page 2: Hierarchically Tiled Arrays

Overview
• Background and goals of HTA approach
• Tiling and Locality
• Structure of HTAs
• Creating an HTA
• Accessing an HTA’s components
• Operations
  • Communication Operations
  • Global Computations
• Implementations of HTA (Libraries)
  • MATLAB
  • C++
• Conclusions

Page 3: Hierarchically Tiled Arrays

Background and Goals
• Developed in 2006 by researchers from University of Illinois, IBM, and University of Coruña (Spain)
• HTA is the first attempt to make tiles part of a language
• Main goals of HTA:
  • Extend a sequential language to support parallel computation
  • Facilitate
    • Control of locality
    • Communication in parallel programs

Page 4: Hierarchically Tiled Arrays

Locality
• Definition
  • The phenomenon of the same value or related storage locations being frequently accessed
• Types
  • Temporal Locality
    • Reuse of specific data within relatively small time durations
  • Spatial Locality
    • Use of data elements within relatively close storage locations
  • Sequential Locality
    • Special case of Spatial Locality
    • Data elements are arranged and accessed linearly

Page 5: Hierarchically Tiled Arrays

Tiling
• The structured division of memory to facilitate parallel computation
• Used to organize computations so that communication costs in parallel programs are reduced
• Important in scientific computing
  • Many iterative and recursive algorithms can be expressed naturally with tiles
• Planned by programmer during program design
• Some compilers have automatic tiling
  • Not always effective in tiling of complex algorithms
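As a rough illustration of programmer-planned tiling, the sketch below restructures a matrix multiplication so that it walks the operands tile by tile. The function names and the `tile` parameter are made up for this example. Plain Python lists will not show the cache benefit, so only the loop structure is the point here.

```python
def matmul_naive(a, b):
    """Reference triple loop: C = A * B for square lists of lists."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile):
    """Same product, visiting one tile of A, B and C at a time."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):          # tile row of C
        for jj in range(0, n, tile):      # tile column of C
            for kk in range(0, n, tile):  # tile along the shared dimension
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        for k in range(kk, min(kk + tile, n)):
                            c[i][j] += a[i][k] * b[k][j]
    return c


a = [[i + j for j in range(4)] for i in range(4)]
b = [[i * j + 1 for j in range(4)] for i in range(4)]
same = matmul_tiled(a, b, tile=2) == matmul_naive(a, b)
```

The three outer loops choose which tiles are active, and the three inner loops do all the work that those tiles allow before moving on. HTA exposes this same grouping as part of the array type instead of leaving it in the loop bounds.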

Page 6: Hierarchically Tiled Arrays

Structure of HTA
• Type of tiled array object
• Tiles can be composed of additional nested tiles
• Uses array operations to manipulate tiles directly
• Parallel computations and interprocessor communications are represented by assignments between HTAs

Page 7: Hierarchically Tiled Arrays

Creating an HTA
• From an existing array (pictured)
• Empty HTA of the same size can be created using hta(3,3)
• Empty HTAs must be assigned content to be considered complete
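Since the slide's picture is not available, here is a minimal sketch of the two creation paths using nested Python lists. The helpers `tile_matrix` and `empty_tiles`, their parameters, and the 6x6 input are assumptions for illustration and are not the HTA library's API. Indices are 0-based, unlike MATLAB.

```python
def tile_matrix(matrix, row_starts, col_starts):
    """Split a list-of-lists matrix into a grid of tiles.

    row_starts / col_starts hold the 0-based index at which each tile
    begins (hypothetical parameters chosen for this sketch).
    """
    row_bounds = list(row_starts) + [len(matrix)]
    col_bounds = list(col_starts) + [len(matrix[0])]
    return [
        [
            [row[c0:c1] for row in matrix[r0:r1]]
            for c0, c1 in zip(col_bounds, col_bounds[1:])
        ]
        for r0, r1 in zip(row_bounds, row_bounds[1:])
    ]


def empty_tiles(n_tile_rows, n_tile_cols):
    """A grid of tile slots with no content yet, loosely mirroring hta(3,3)."""
    return [[None] * n_tile_cols for _ in range(n_tile_rows)]


# Assumed example: a 6x6 array cut into a 3x3 grid of 2x2 tiles.
m = [[10 * r + c for c in range(6)] for r in range(6)]
tiles = tile_matrix(m, row_starts=[0, 2, 4], col_starts=[0, 2, 4])

# An empty grid only becomes usable once every slot has been assigned.
blank = empty_tiles(3, 3)
blank[0][0] = tiles[0][0]
```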

Page 8: Hierarchically Tiled Arrays

Accessing an HTA
• Accessing Tiles
  • C{2,1} refers to lower left tile
• Accessing Elements Directly
  • C(5,4) refers to specific element, disregarding the tile structure
  • Note parentheses (elements) vs. curly braces (tiles)
• Accessing Elements Relatively
  • C{2,1}{1,2}(1,2) and C{2,1}(1,4) refer to the same element as C(5,4)
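The three addressing forms can be checked against each other with nested lists. This sketch assumes an 8x8 array split into a 2x2 grid of 4x4 tiles, each split again into 2x2 tiles of 2x2 elements. That layout is inferred from the indices on the slide, not taken from the missing figure. The `split` and `flatten` helpers are hypothetical, and every MATLAB-style 1-based index becomes a 0-based Python index.

```python
def split(matrix, size):
    """Cut a square list-of-lists matrix into a grid of size x size tiles."""
    n = len(matrix)
    return [
        [[row[c:c + size] for row in matrix[r:r + size]] for c in range(0, n, size)]
        for r in range(0, n, size)
    ]


def flatten(tile):
    """Rebuild the plain matrix stored in a tile that is a grid of sub-tiles."""
    rows = []
    for tile_row in tile:
        for r in range(len(tile_row[0])):
            rows.append([x for sub in tile_row for x in sub[r]])
    return rows


# Element at 1-based (row, col) holds 10*row + col, so C(5,4) should be 54.
flat = [[10 * r + c for c in range(1, 9)] for r in range(1, 9)]

# Two levels: a 2x2 grid of 4x4 tiles, each again a 2x2 grid of 2x2 tiles.
C = [[split(t, 2) for t in tile_row] for tile_row in split(flat, 4)]

lower_left = C[1][0]                 # C{2,1}
direct = flat[4][3]                  # C(5,4)
via_subtile = C[1][0][0][1][0][1]    # C{2,1}{1,2}(1,2)
via_tile = flatten(C[1][0])[0][3]    # C{2,1}(1,4)
```

All three lookups land on the same stored value. The only difference is how much of the tile hierarchy the index expression spells out.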

Page 9: Hierarchically Tiled Arrays

Accessing an HTA (cont.)
• Regional Access (Flattening)
  • Disregards tiling, returns a standard array
  • C(1:2, 3:6) in example returns 2x4 matrix
• Logical Indexing/Selection
  • Matrix of boolean values with same dimensions as the HTA
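A small sketch of both access styles, again on nested lists. It assumes an 8x8 array stored as a 2x2 grid of 4x4 tiles, and it reads "same dimensions as the HTA" as one boolean per tile. Both are assumptions, as are the helper names `make_tiles`, `region` and `select`.

```python
TILE = 4  # assumed tile edge length; the slide's figure is not available


def make_tiles(n, tile):
    """n x n array holding 10*row + col (1-based), stored as a grid of tiles."""
    return [
        [
            [[10 * (r + 1) + (c + 1) for c in range(c0, c0 + tile)]
             for r in range(r0, r0 + tile)]
            for c0 in range(0, n, tile)
        ]
        for r0 in range(0, n, tile)
    ]


def region(tiles, rows, cols, tile):
    """Flattened access: gather a rectangle into a plain matrix, ignoring tile borders."""
    return [
        [tiles[r // tile][c // tile][r % tile][c % tile] for c in cols]
        for r in rows
    ]


def select(tiles, mask):
    """Logical selection: keep the tiles whose mask entry is True, in row-major order."""
    return [
        t
        for tile_row, mask_row in zip(tiles, mask)
        for t, keep in zip(tile_row, mask_row)
        if keep
    ]


C = make_tiles(8, TILE)
block = region(C, range(0, 2), range(2, 6), TILE)   # MATLAB-style C(1:2, 3:6)
diagonal = select(C, [[True, False], [False, True]])
```

The requested columns 3 to 6 straddle the boundary between two tiles, yet `block` comes back as one ordinary 2x4 matrix.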

Page 10: Hierarchically Tiled Arrays

Operations on HTAs
• Two types of operations
  • Communication Operations
  • Global Computations

Page 11: Hierarchically Tiled Arrays

Communication Operations
• Represented as assignments on distributed HTAs
• Array Operations
  • HTAs provide overloaded array operations (circular shift, transpose, arithmetic operators, etc.) so that you can perform these operations at the tile level
  • Ex. Circular Shift will shift whole tiles instead of individual array elements
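To make the tile-level circular shift concrete, the sketch below rotates a grid of tiles without touching the contents of any tile. `circshift_tiles` and `full` are hypothetical helpers. On a distributed HTA each moved tile would be a message between processors, which this single-process version does not model.

```python
def circshift_tiles(tiles, shift, axis):
    """Circularly shift a grid of tiles by whole tiles.

    axis=0 moves tiles down by `shift` tile rows, axis=1 moves them right
    by `shift` tile columns; negative values shift the other way.
    """
    n_rows, n_cols = len(tiles), len(tiles[0])
    if axis == 0:
        return [list(tiles[(r - shift) % n_rows]) for r in range(n_rows)]
    return [[row[(c - shift) % n_cols] for c in range(n_cols)] for row in tiles]


def full(value):
    """A 2x2 tile filled with one value, so tiles are easy to tell apart."""
    return [[value, value], [value, value]]


grid = [[full(1), full(2), full(3)],
        [full(4), full(5), full(6)]]

right = circshift_tiles(grid, 1, axis=1)   # every tile moves one column right
up = circshift_tiles(grid, -1, axis=0)     # every tile row moves one row up
```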

Page 12: Hierarchically Tiled Arrays

Communication Operations (cont.)
• Example: Cannon’s algorithm
  • Distributed algorithm for matrix multiplication
  • Uses circular shift to distribute the work
• Normal implementation
  • Shifts rows and columns
  • Performs the multiplication element by element
• HTA implementation
  • Shifts tiles
  • Performs the multiplication as matrix multiplication of tiles
  • Each processor owns a tile
• Advantages
  • Aggregates data into a tile for communication
  • Increased locality due to single matrix multiplication
  • Locality further increased if more levels of tiling are used
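The tile-based form of Cannon's algorithm can be simulated in one process. The sketch below assumes square matrices cut into a q x q grid of equal tiles and uses hypothetical helpers (`split`, `join`, `matmul`, `matadd`, `cannon`). On a real machine each grid position would be a processor and each shift a communication step.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists (one tile times one tile)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b):
    """Element-wise sum of two equally sized tiles."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m, size):
    """Cut a square matrix into a grid of size x size tiles."""
    n = len(m)
    return [[[row[c:c + size] for row in m[r:r + size]] for c in range(0, n, size)]
            for r in range(0, n, size)]


def join(tiles):
    """Inverse of split: glue a grid of tiles back into one matrix."""
    rows = []
    for tile_row in tiles:
        for r in range(len(tile_row[0])):
            rows.append([x for t in tile_row for x in t[r]])
    return rows


def cannon(a_tiles, b_tiles):
    q = len(a_tiles)
    # Initial skew: tile row i of A moves left by i, tile column j of B moves up by j.
    a = [[a_tiles[i][(j + i) % q] for j in range(q)] for i in range(q)]
    b = [[b_tiles[(i + j) % q][j] for j in range(q)] for i in range(q)]
    c = None
    for _ in range(q):
        # Every grid position multiplies the two tiles it currently holds.
        prod = [[matmul(a[i][j], b[i][j]) for j in range(q)] for i in range(q)]
        c = prod if c is None else [
            [matadd(c[i][j], prod[i][j]) for j in range(q)] for i in range(q)
        ]
        # Circular shift by one tile: A to the left, B upwards.
        a = [[a[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        b = [[b[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return c


A = [[4 * i + j for j in range(4)] for i in range(4)]
B = [[(i + 1) * (j + 2) for j in range(4)] for i in range(4)]
product = join(cannon(split(A, 2), split(B, 2)))
```

After the skew, position (i, j) holds A tile (i, i+j) and B tile (i+j, j), so the two tiles always share the same inner index. The q shifts then step that index through every value.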

Page 13: Hierarchically Tiled Arrays

Cannon’s Algorithm
• Can use an alternative matrix multiplication function to increase locality
• Level(A)
  • Returns 0 when A is a scalar or a matrix
  • If it's not, recursively proceed down HTA hierarchy until leaf tile reached
• size(H,i)
  • Returns number of tiles in H along the i-th dimension
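The slide's code listing did not survive, so the following is only a sketch of what a recursive, level-aware multiplication could look like on nested lists. `level` follows the description above. `add`, `multiply`, `split` and `flatten` are hypothetical helpers, not the authors' function.

```python
def level(a):
    """0 for a scalar or a plain matrix (leaf tile), else 1 + the level of its tiles."""
    if isinstance(a, (int, float)) or isinstance(a[0][0], (int, float)):
        return 0
    return 1 + level(a[0][0])


def add(a, b):
    """Element-wise sum that works at any depth of nesting."""
    if isinstance(a, (int, float)):
        return a + b
    return [add(x, y) for x, y in zip(a, b)]


def multiply(a, b):
    """Multiply two equally tiled operands, recursing until leaf tiles are reached."""
    if level(a) == 0:
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
    result = []
    for i in range(len(a)):                 # tiles along the first dimension of A
        result_row = []
        for j in range(len(b[0])):          # tiles along the second dimension of B
            acc = multiply(a[i][0], b[0][j])
            for k in range(1, len(b)):
                acc = add(acc, multiply(a[i][k], b[k][j]))
            result_row.append(acc)
        result.append(result_row)
    return result


def split(m, size):
    n = len(m)
    return [[[row[c:c + size] for row in m[r:r + size]] for c in range(0, n, size)]
            for r in range(0, n, size)]


def flatten(a):
    """Collapse any number of tiling levels back into one plain matrix."""
    if level(a) == 0:
        return a
    rows = []
    for tile_row in a:
        flat_tiles = [flatten(t) for t in tile_row]
        for r in range(len(flat_tiles[0])):
            rows.append([x for t in flat_tiles for x in t[r]])
    return rows


A = [[4 * i + j for j in range(4)] for i in range(4)]
B = [[i - j for j in range(4)] for i in range(4)]
two_level = lambda m: [[split(t, 1) for t in row] for row in split(m, 2)]
nested_product = flatten(multiply(two_level(A), two_level(B)))
```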

Page 14: Hierarchically Tiled Arrays

Global Computations
• Passing an HTA to a function/operation
  • Operates in parallel on a set of tiles from an HTA distributed across a parallel machine
  • parHTA(@func, H)
    • @func is a function pointer
• Reduction
  • An operation applied to all the components of a vector to produce a scalar
  • Applied to all the components of an n-dimensional array to produce an (n-1)-dimensional array
  • Matrices have row reduction and column reduction
  • Ex. Matrix Vector multiplication

Page 15: Hierarchically Tiled Arrays

Global Computations (cont.)
• Ex. Matrix Vector multiplication
  • Matrix-vector multiplication takes place locally at each processor (the A*B part)
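The slide's listing is missing, so here is a hedged sketch of the same two-phase idea: a local product per tile, followed by a reduction across each row of tiles. `par_tiles` is a hypothetical stand-in for parHTA built on a thread pool. The tile layout and all helper names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def matvec(tile, piece):
    """Local work on one processor: one tile of A times one piece of B."""
    return [sum(a * x for a, x in zip(row, piece)) for row in tile]


def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]


def par_tiles(func, pairs):
    """Hypothetical stand-in for parHTA: apply func to every (tile, piece) pair."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda pair: func(*pair), pairs))


def tiled_matvec(a_tiles, b_pieces):
    result = []
    for tile_row in a_tiles:
        partials = par_tiles(matvec, zip(tile_row, b_pieces))  # the A*B part
        result.extend(reduce(vec_add, partials))               # row reduction
    return result


# A 4x4 matrix stored as a 2x2 grid of 2x2 tiles, and B cut into two pieces.
a_tiles = [
    [[[1, 2], [5, 6]], [[3, 4], [7, 8]]],
    [[[9, 10], [13, 14]], [[11, 12], [15, 16]]],
]
b_pieces = [[1, 0], [2, 1]]
y = tiled_matvec(a_tiles, b_pieces)
```

Each row of tiles yields one partial vector per tile, and summing those partial vectors is the reduction step that turns them into a slice of the final result.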

Page 16: Hierarchically Tiled Arrays

Implementations of HTA
• MATLAB
  • Advantages
  • Disadvantages
• C++
  • Allocation approach
  • Advantages over MATLAB
• Comparison to existing paradigm (MPI)

Page 17: Hierarchically Tiled Arrays

MATLAB implementation
• Advantages
  • MATLAB's syntax for array access is convenient for representing HTA access (“Triplet Notation”)
• Disadvantages
  • Not as efficient as C++'s implementation
  • Interpreted MATLAB limits efficiency due to immense overhead
  • Lots of temporary variables created to hold partial expression results
  • Parameters passed by value
  • Assignment statements also replicate data

Page 18: Hierarchically Tiled Arrays

C++ Implementation
• Allocation/construction of HTAs
  • HTAs represented as composite objects with methods to operate on HTAs
  • HTAs allocated on heap, return a handle/reference
    • All HTA access occurs through this handle
  • Uses reference counting for garbage collection
• Advantages over MATLAB
  • Higher performance than MATLAB implementation
  • Memory layout is controllable by the user unlike MATLAB

Page 19: Hierarchically Tiled Arrays

HTA vs. MPI (Message Passing Interface)
• HTA is integrated into languages using operator overloading and polymorphism
  • Improved readability
• HTA requires much less code
  • Communication takes place through assignment rather than
    • send and receive instructions
    • packing and unpacking data
    • checking boundary conditions
• HTAs are partitioned and distributed through a single constructor
  • MPI has to compute
    • Limits of data owned by each processor
    • Neighbors of a processor
    • Active set of processors, etc.
• Communication is made more obvious
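To give a feel for the bookkeeping that a message-passing program carries by hand, the sketch below computes the data limits and neighbors of one process under a simple 1-D block distribution. It makes no MPI calls. `block_bounds` and `ring_neighbors` are hypothetical helpers, and the ring topology is an assumption.

```python
def block_bounds(n, n_procs, rank):
    """Half-open [start, stop) range of the n items owned by `rank`.

    Block distribution: the first n % n_procs ranks get one extra item.
    """
    base, extra = divmod(n, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def ring_neighbors(n_procs, rank):
    """Left and right neighbor of `rank` when processes form a ring."""
    return (rank - 1) % n_procs, (rank + 1) % n_procs


# Ten items over four processes.
limits = [block_bounds(10, 4, r) for r in range(4)]
left, right = ring_neighbors(4, 0)
```

With an HTA this arithmetic is implied by the tiling given to the constructor, which is the contrast the slide draws.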

Page 20: Hierarchically Tiled Arrays

HTA vs. MPI (cont.)

Page 21: Hierarchically Tiled Arrays

Conclusions
• Ideal for algorithms that can be naturally expressed in terms of blocks
• Communication is less involved versus other parallel programming approaches
• Slow and elegant in MATLAB, but fast and verbose in C++
  • Due to lack of operator overloading in the C++ library
• Good for architectures consisting of multiple independent processing cores
  • One example given by the authors is the CELL processor

Page 22: Hierarchically Tiled Arrays

References

Bikshandi, G. (2006, October 12). Hierarchically Tiled Arrays: A Programming paradigm for parallelism and locality. Urbana, IL, USA: www.cs.uiuc.edu/class/fa06/cs498dp/notes/hta.pdf

Bikshandi, Guo, & Hoeflinger. (2006, March 29). Programming for Parallelism and Locality with Hierarchically Tiled Arrays. New York, NY, USA: https://wiki.engr.illinois.edu/download/attachments/195766122/p48-bikshandi.pdf

Various. (2010, January 15). Programming for Parallelism and Locality with Hierarchically Tiled Arrays. Retrieved July 12, 2012, from CSU Computer Science Department Wiki: https://www.cs.colostate.edu/wiki/Programming_for_Parallelism_and_Locality_with_Hierarchically_Tiled_Arrays.

Various. (2012, May 16). Locality of Reference. Retrieved July 12, 2012, from Wikipedia: http://en.wikipedia.org/wiki/Locality_of_reference
