Upload
everett-hill
View
213
Download
0
Embed Size (px)
DESCRIPTION
Introduction Many-core systems will feature an extremely dynamic workload. An unpredictable sequence of different applications enter and leave the system at run-time. A run-time system manager is required to efficiently map an incoming application onto the system resources. 3
Citation preview
Smart Hill Climbing for Agile Dynamic Mapping in Many-Core Systems
Design Automation Conference(DAC), pp.1-6, May 29-June 7 2013, Austin, TX, USAM. Fattah, M. Daneshtalab, P. Liljeberg, J. Plosila
Reporter Hsuan-Ru Li
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
2
Introduction
Many-core systems will feature an extremely dynamic workload.
An unpredictable sequence of different applications enter and leave the system at run-time.
A run-time system manager is required to efficiently map an incoming application onto the system resources.
3
Introduction(cont.)
Central manager (CM) of the system decides on the appropriate node for each task.
The system performance is significantly influenced by the utilized mapping approach.
Consider a contiguous application mapping: Relatively close nodes. No fragmentation.
4
Introduction(cont.)
Finding a convex region of nodes is a polynomial, O(n3), problem.
CoNA method significantly decreased complexity.
5
Introduction(cont.)
CoNA starts from a first node and attempts to map the application tasks onto a set of contiguous nodes around it.
Select a first node leads to least fragmentation of remaining nodes.
Hill climbing search heuristic is adapted in order to find the optimum first node rapidly among all the available nodes.
6
Introduction(cont.)
Smart Hill Climbing: HiC with intelligence. n is the given network size. Best case: O(√n) Worst case: O(n2)
7
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
8
Related Work
Different first node or generally the mapping area selection methods. First node selection:
Nearest Neighbor (NN) Best Neighbor (BN)
Region selection: Incremental (INC) approach CoNA VIP-supported approach
9
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
10
Definitions
Consider a homogenous mesh-based NoC in definitions and experiments.
Define several evaluation metrics as assessment tools to compare different algorithms.
Mapping algorithms try to allocate system resources in an optimal way.
11
Definitions(cont.)
Each application in the system is represented by a directed graph denoted as a task graph Ap =TG(T, E), ti ∈ T , edge ei,j ∈ E, wi,j = amount of data transferred.
12
Definitions(cont.)
Architecture graph AG(N, L) is a simple M×M 2D-mesh NoC with the XY routing.
A set of nodes nx,y ∈N, connected together through communication links lk ∈ L.
Each node nx,y contains a 5-port router rx,y connected to the local processing element pex,y by its local port.
13
Definitions(cont.)14
Definitions(cont.)15
Definitions(cont.)
One-to-one mapping function from the set of application tasks T, to the set of NoC nodes N:
16
Definitions(cont.)
Mapping function is started if and only if there are enough available nodes to map onto them.
17
Definitions(cont.)
A task ti is mapped onto as nti and the packet corresponding to the edge ei,j as pcki,j
The packet sent from nti to ntj. The set of running applications on the system is also denoted by APPS.
|APPS| means number of running applications.
18
Definitions(cont.)19
Average Weighted Manhattan Distance (AWMD)
Definitions(cont.)
Congestion increases the network latency dramatically and also increases the network dynamic power consumption.
External congestion occurs when a network channel is contented by the packets of different applications.
Internal congestion is related to the packets of the same application.
20
Definitions(cont.)
To decrease congestion The mapped area of an application
should be as convex as possible and minimally fragmented.
The allocated nodes as a metric to assess the Mapped Region Dispersion.
21
Definitions(cont.)
The area with the smallest MRD is almost circular.
However, a circular region will generate more area fragmentation in long term.
The best mapped area would be square.
22
Definitions(cont.)
MRD of a square with |T| nodes
Normalized MRD metric
23
Definitions(cont.)
The NMRD value of 1 means a squared area.
NMRD increases Mapped area is getting more
fragmented. Less similar to a square shape.
24
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
25
Smart First Node Selection26
Approximate model estimates the number of available nodes in a square shape around a given node(First node).
Proposed model is utilized in the adapted hill climbing search hubristic Find the appropriate first node in an
agile and smart manner.
Smart First Node Selection(cont.)
27
Square factor of a given node, SF(ni,j), is the estimated number of contiguous, almost square-shaped, available nodes around that node.
Smart First Node Selection(cont.)
28
First find the largest square centered on ni,j, SQmax = (ni,j, rmax)
Some more nodes beyond the square borders not belonging to system rectangles, as marked with asterisk.
Smart First Node Selection(cont.)
29
The algorithm also calculates a direction, called open direction (openDir). Indicates one of the neighbors of the
node estimated to have a larger SF. An openDir of value zero
No specific direction is predicted to result in a larger square factor.
Smart First Node Selection(cont.)
30
SF calculation has a linear time complexity of O (|APPS|).
However, this will take O(M2|APPS|) time. (exhaustive search)
Smart First Node Selection(cont.)
31
Smart Hill Climbing(SHiC) starts from a randomly selected node and walks smartly through the network nodes.
This significantly reduces the amount of traversed nodes, resulting in an agile mapping algorithm.
Smart First Node Selection(cont.)
32
SHiC looks for the node with the optimum SF according to a preference function.(SF(ncur) < |T| AND SF(nnext) > SF(ncur)) OR(SF(nnext) ≥ |T| AND SF(nnext) < SF(ncur))
Smart First Node Selection(cont.)
33
Smart First Node Selection(cont.)
34
Find the Find the Largest Largest
SFSF
Smart First Node Selection(cont.)
35
Smart First Node Selection(cont.)
36
Smart First Node Selection(cont.)
37
Stochastic hill climbing approach is executed several times starting from different randomly chosen nodes.
The algorithm is repeated (2+√|APPS|) times (line 1).
The SHiC outer loop is executed O(√|APPS|) times, the while loop is executed O(M) times.
Smart First Node Selection(cont.)
38
SF(nnext) is calculated in each iteration in O(|APPS|).
The time complexity of the SHiC will be O(M×|APPS|3/2).
This is significantly faster than the exhaustive search.
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
39
Results And Analysis
Applications with 4 to 35 tasks are generated using TGG.
The communication volumes (wi,j) are randomly distributed between 2 to 16 flits of data.
SystemC many-core platform which utilizes a pruned version of Noxim, as its communication architecture.
40
Results And Analysis(cont.)
Different mapping and first node selection are evaluated.
The network size varying from 8×8 to 20×20 nodes.
A random sequence of applications is entered into the scheduler FIFO according to the desired rate, λ.
The maximum possible scheduling rate is called λfull.
41
Results And Analysis(cont.)
An allocation request for the scheduled application is sent to the CM of the platform residing in the node n0,0.
42
Results And Analysis(cont.)43
Results And Analysis(cont.)
In order to assess the square factor accuracy. Perform exhaustive search to select the
node. Network size: 16×16 . Applications enter the system with
0.8λfull rate.
44
Results And Analysis(cont.)45
Results And Analysis(cont.)
Search on the SF, exhaustive search is the best result.
SHiC on dispersion (NMRD) and power (AWMD) metrics show only 4% of increase.
SHiC approach significantly enhances the performance of the system under the same utilized mapping algorithm.
46
Results And Analysis(cont.)
CoNA/SHiC verse CoNA/NN case. Both the dispersion and power
dissipation of applications increase versus increase of λ.
SHiC keeps its preeminence over other approaches and outperforms by 10 to 30 percent.
47
Results And Analysis(cont.)
Study the λ effect on the system performance.
48
Results And Analysis(cont.)
CoNA/SHiC verse NN/NN case. λ is kept 0.8λfull. SHiC scales well as the network size
increases and keeps the system performance at the same level.
49
Results And Analysis(cont.)50
Outline
Introduction Related Work Definitions Smart First Node Selection Results And Analysis Conclusions
51
Conclusions
This algorithm utilized an approximate model which quickly estimates the available area around a given node.
The provided open direction aided the climbing algorithm to reach the optimum node faster by taking smart steps.
52
Conclusions(cont.)
Results emphasized the significant impact of convex mapping on congestion reduction of the network.
53
Thank You
54