
Research and Practices on 3D Networks-on-Chip Architectures

Amir-Mohammad Rahmani (1,2), Khalid Latif (1,2), Pasi Liljeberg (1), Juha Plosila (1), Hannu Tenhunen (1,2)

(1) University of Turku, Finland. (2) Turku Centre for Computer Science (TUCS), Finland.
{amir.rahmani, khalid.latif, pasi.liljeberg, juha.plosila, hannu.tenhunen}@utu.fi

Abstract: To continue the growth of the number of transistors on a chip, the 3D IC practice, where multiple silicon layers are stacked vertically, is emerging as a revolutionary technology. Partitioning a larger die into smaller segments and then stacking them in a 3D integration can significantly reduce latency and energy consumption. Such benefits stem from the fact that inter-wafer distances are negligible compared to intra-wafer distances, which substantially reduces global wiring length in 3D chips. This progress has introduced novel architectures and new challenges for high-performance, power-aware design exploration. In this paper, we outline the opportunities and challenges associated with three-dimensional networks-on-chip architectures with respect to different design metrics. In this context, we categorize and present several alternatives for 3D NoC architectures, and we investigate and summarize the impact of these architectures on various system characteristics.

I. INTRODUCTION

Communication plays a crucial role in the design and performance of multi-core systems-on-chip (SoCs). On-chip transistor density has recently increased considerably, which enables the integration of dozens of components on a single die. One outcome of greater integration is that interconnection networks have started to replace shared buses. Networks-on-chip (NoCs) [1][2] have been proposed for communication between cores in complex SoCs because they scale better than traditional forms of on-chip interconnect and have better performance and power consumption characteristics [2]. The design of 2D NoCs has been examined from various aspects, such as performance, power and reliability [3][4][5][6][7][8], and some commercial products already deploy such networks [9][10]. The advent of three-dimensional (3D) stacked technologies provides a new horizon for on-chip interconnect design. 3D ICs, which contain multiple layers of active devices, have the potential to enhance system power/performance characteristics [11][12][13][14][15]. 3D ICs allow for performance enhancements even in the absence of scaling [12][16], because of the reduced length of interconnects. Besides this clear benefit, package density is increased significantly, power is reduced due to shorter wires, and circuitry is more immune to noise [12]. The power/performance improvement arising from the architectural advantages of NoCs will be significantly enhanced if 3D ICs are adopted as the basic fabrication methodology. The amalgamation of two emerging paradigms, NoC and 3D IC, allows for the creation of new structures that enable significant performance enhancements over more traditional solutions. With freedom in the third dimension, architectures that were prohibitive due to wiring constraints in 2D ICs are now possible, and many 3D implementations can outperform their 2D counterparts [11].

Even though both 3D integrated circuits and NoCs have been proposed as answers to interconnect scaling demands, combining the two approaches to design three-dimensional NoCs raises several challenges: only a few EDA tools are commercially available, design methodologies are lacking, peak temperatures are high, power densities increase, and vertical interconnects have large area footprints. To address these issues, new architectures and design methods are needed. In the last few years, there have been many efforts in interconnection network design for 3D stacked CMPs. The purpose of this survey is to clarify the 3D NoC concept and to map the scientific efforts made in the area of architectural and topological optimizations in NoC research. We identify general trends and explain a range of issues which are important for designing efficient 3D NoC architectures. Moreover, this survey highlights the significance of different network characteristics (e.g. power dissipation, thermal issues, etc.) in on-chip networks, and brings valuable insights and comparisons of different existing architectures in the area of 3D on-chip network designs.

The rest of this paper is organized as follows: Section II covers basics and developments pertinent to 3D integrated circuit technologies. Section III provides details of the existing research and practices on 3D Networks-on-Chip designs and presents the impact of existing architectures on different system characteristics, while Section IV summarizes the survey.

II. THREE-DIMENSIONAL NETWORKS-ON-CHIP

In the last few years, there have been many efforts to design and fabricate 3D ICs for different applications. The designs include a NAND Flash using double stack S3 technology [17] from Samsung [18], 3D-OTP memory from Matrix [19], the DARPA 3D IC project [20] from MIT Lincoln Laboratories, and recent projects from IBM [21] and Tezzaron [22]. These 3D integrated circuits have emerged to overcome the limitations of interconnect scaling [23] by stacking active silicon layers [24][25]. Compared with traditional 2D design, 3D ICs offer a number of advantages, such as shorter global interconnects, higher performance, lower interconnect power consumption due to wire-length reduction, higher packing density and smaller footprint, and support for the implementation of mixed-technology chips [29][30]. In this context, several 3D designs have appeared recently. This section briefly introduces the technologies that have been explored.

Several vertical interconnect technologies have been explored, including wire bonding, microbump, contactless (capacitive or inductive), and through-silicon-via (TSV) vertical interconnect [13]. Among the 3D integration technologies, TSV is the most promising one, hence the majority of 3D integration R&D activities concentrate on this approach [23].

Utilizing 3D design offers increased bandwidth [33] and a reduced average interconnect wire length [34], which results in considerable savings in overall power consumption. It has also been demonstrated that a 3D design can be utilized to improve reliability [35]. However, the adoption of 3D integration technology faces the challenge of increased chip temperature due to higher power density compared to a planar 2D design [36]. In this context, several techniques have been proposed for 3D architectures, such as physical design optimization through intelligent placement [37], increasing the thermal conductivity of the stack through insertion of thermal vias [36], and the use of novel cooling structures [38]. In addition, recent work reveals that, in the placement of processing cores in a 3D chip, the areal power density is the more significant design constraint [39]. Consequently, thermal concerns can be managed as long as components with high power density are not stacked on top of each other. To prevent severe thermal problems, architectures that stack memory on top of processor cores, or those that rely on low-power processor cores, have been developed [40]. It should be noted that increased temperatures increase wire resistances, and consequently the interconnect delays [41].

978-1-4244-8971-8/10$26.00 ©2010 IEEE

III. 3D ARCHITECTURES

Recently, various architectures have been proposed for 3D NoCs, each with a different impact on design metrics. This section is concerned with the existing architectures for 3D NoC design and reviews their influence on different system characteristics.

i. SYMMETRIC NOC ARCHITECTURE

In order to integrate many nodes into a 3D chip, the simplest approach is to group the nodes into multiple layers and stack them on top of each other, as shown in Figure 1(a), which shows 3 layers stacked together, each with 9 nodes, totaling 27 nodes. We call this architecture a 3D Symmetric NoC, since both intra- and inter-layer movement bear identical characteristics: hop-by-hop traversal. Despite its simplicity, this architecture has two major inherent drawbacks. Firstly, it does not exploit the beneficial attribute of a negligible inter-wafer distance (around 50 μm per layer) in 3D chips [42]. Since traveling in the vertical dimension is multi-hop, it takes the same time as moving within the layer. In this architecture, inter-layer and intra-layer hops are indistinguishable, even though the average number of hops between a source and a destination does decrease as a result of folding a 2D design into multiple stacked layers. In addition, the buffering and arbitration delay of each flit at every hop adds to the overall delay for routing within the layer. Secondly, a larger 7×7 crossbar is required as a result of the two extra ports. According to [42], crossbars scale upward very inefficiently: their reported results show that, compared with a 5×5 crossbar, a 6×6 crossbar consumes about 21% more power, and the power consumption of a 7×7 crossbar is approximately 2.24 times more than the 5×5 counterpart. Thus, the 3D Symmetric NoC implementation is a somewhat naive extension of the baseline 2D network because of its excessive area and power overhead.
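The claim that folding a 2D design into stacked layers lowers the average hop count can be checked with a short brute-force calculation. The sketch below is illustrative only. It assumes minimal routing (hop count equals Manhattan distance), uniform traffic over all distinct source/destination pairs, and a 64-node network (8×8 versus 4×4×4); none of these come from the survey, and `average_hops` is a hypothetical helper name.

```python
from itertools import product


def average_hops(dims):
    """Mean minimal hop count over all ordered pairs of distinct mesh nodes.

    dims is a tuple of mesh sizes per dimension, e.g. (8, 8) or (4, 4, 4).
    Assumes minimal routing, so hops == Manhattan distance (an assumption).
    """
    nodes = list(product(*(range(d) for d in dims)))
    total = 0
    pairs = 0
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            total += sum(abs(s - t) for s, t in zip(src, dst))
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    flat = average_hops((8, 8))        # 64 nodes on a single layer
    stacked = average_hops((4, 4, 4))  # the same 64 nodes on four layers
    print(f"2D 8x8 mesh:   {flat:.2f} hops on average")
    print(f"3D 4x4x4 mesh: {stacked:.2f} hops on average")
```

Under these assumptions the stacked mesh needs fewer hops on average, but in the symmetric architecture every hop, vertical or horizontal, still pays the full per-router buffering and arbitration delay.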

ii. 3D NOC-BUS HYBRID ARCHITECTURE

One of the main characteristics of a 3D IC is the short inter-layer distance [13]. To take advantage of this attribute, the 3D NoC-Bus Hybrid architecture was proposed, which is a hybrid between a packet-switched network and a bus. Given the very small inter-layer distance, single-hop communication in the vertical dimension is in fact feasible, so a symmetric NoC architecture with multi-hop inter-layer communication is not efficient. As can be seen from Figure 1(b), the NoC router can be hybridized with a bus link in the vertical dimension to create a 3D NoC-Bus Hybrid structure. This approach was first used in a 3D NUCA L2 cache for CMPs [43]. The hybrid system provides both performance and area benefits. It requires a 6×6 crossbar, since the bus adds a single additional port to the generic 2D 5×5 crossbar; compared to the 7×7 crossbar of the symmetric architecture, it is less power-hungry and occupies less area. The additional link forms the interface between the NoC domain and the bus domain. The bus link has its own dedicated queue, which is controlled by a central arbiter. Flits from different layers wishing to move up or down must arbitrate for access to the shared medium. Furthermore, each bus connects only a small number of nodes, which keeps the overall capacitance on the bus small and considerably simplifies bus arbitration.

However, the bus approach also has a drawback. Since the bus is a shared medium, it does not allow concurrent communication in the third dimension. Therefore, under high network loads, the probability of contention and blocking increases critically. As a result, inter-layer bandwidth degrades considerably, even though single-hop vertical communication does improve overall latency.
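The crossbar sizes quoted for the 2D mesh, the NoC-Bus hybrid and the symmetric 3D NoC follow from a simple port count, which the sketch below tabulates together with the relative power figures the survey attributes to [42]. The names are hypothetical, and reading "2.24 times more than" as a factor of 2.24 relative to the 5×5 crossbar is an assumption about the source's wording.

```python
# Relative crossbar power as quoted in the text from [42], normalized to 5x5.
# The 7x7 entry reads "2.24 times more than" as a factor of 2.24 (assumption).
RELATIVE_CROSSBAR_POWER = {5: 1.00, 6: 1.21, 7: 2.24}


def crossbar_ports(vertical_ports):
    """Ports of a mesh router: 4 cardinal directions + 1 local core + vertical."""
    return 4 + 1 + vertical_ports


ARCHITECTURES = {
    "2D mesh": crossbar_ports(0),            # no vertical links
    "3D NoC-Bus hybrid": crossbar_ports(1),  # one port to the vertical bus
    "3D symmetric": crossbar_ports(2),       # separate up and down ports
}

if __name__ == "__main__":
    for name, ports in ARCHITECTURES.items():
        power = RELATIVE_CROSSBAR_POWER[ports]
        print(f"{name}: {ports}x{ports} crossbar, {power:.2f}x the 5x5 power")
```

The table captures only the crossbar; it says nothing about the bus contention described above, which is a bandwidth effect rather than a router-size effect.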

iii. CILIATED 3D MESH ARCHITECTURE

Another proposed method of constructing a 3D NoC is to add layers of functional IP blocks and restrict the switches to one layer or a small number of layers. In this context, Feero et al. [11] introduce a new architecture called the ciliated 3D Mesh. This structure is essentially a 3D Mesh network with multiple IP blocks per switch. A 3 × 3 × 3 ciliated 3D Mesh network with three IPs per switch is shown in Figure 1(c). In a two-layer ciliated 3D Mesh network, each switch contains at most 5+k ports (one for each cardinal direction, one either up or down, and one to each of the k IP blocks); a switch with both an up and a down port has one more. As a consequence of multiple IP cores per switch and diminished connectivity, this architecture presents lower overall bandwidth compared to a symmetric 3D Mesh. Nonetheless, it was shown that this type of network offers an advantage in terms of energy dissipation, especially in the presence of specific traffic patterns.
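The port and switch counts of a ciliated mesh can be written down directly from the port types listed above. The following is a minimal sketch with hypothetical function names. The 64-IP example is an illustrative assumption, not a configuration taken from the survey.

```python
def ciliated_switch_ports(k, layers):
    """Upper bound on ports per switch with k IP blocks attached per switch.

    Counts 4 cardinal ports, k IP ports, and the vertical ports a switch
    can need: none on a single layer, one (up or down) in a two-layer
    stack, and two (up and down) for an interior layer of a taller stack.
    """
    if k < 1 or layers < 1:
        raise ValueError("k and layers must be positive")
    vertical = 0 if layers == 1 else (1 if layers == 2 else 2)
    return 4 + vertical + k


def switches_needed(ip_blocks, k):
    """Switches required to attach ip_blocks IPs with k IPs per switch."""
    return -(-ip_blocks // k)  # ceiling division


if __name__ == "__main__":
    # Hypothetical 64-IP system: one IP per switch versus two IPs per switch.
    for k in (1, 2):
        print(k, switches_needed(64, k), ciliated_switch_ports(k, layers=2))
```

Attaching more IPs per switch reduces the number of switches but enlarges each one, which is the trade-off behind the lower bandwidth and lower energy reported for this architecture.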

iv. TRUE 3D NOC ROUTER

Given the drawbacks of the preceding architectures, a true 3D crossbar that integrates the vertical links seamlessly into the overall router operation can be desirable. Based on this vision, an efficient router structure was proposed in [42]; the layout of such a 3D crossbar is illustrated in Figure 1(d). It should be noted that the traditional definition of a crossbar, in the context of a 2D physical layout, is a switch in which each input is connected to each output through a single connection point. However, extending this definition to a physical 3D structure would imply a switch of enormous complexity and size (given the increased number of input and output port pairs associated with the various layers). In this architecture a simpler structure was chosen, which can connect an input to an output port through more than one connection point.

The vertical links are embedded in the crossbar and extend to all layers. This implies the use of a 5×5 crossbar, since no additional physical channels need to be dedicated to inter-layer communication. Interconnection between the various links in a 3D crossbar has to be provided by dedicated connection boxes at each layer. These connecting points facilitate linkage between vertical and horizontal channels, allowing flexible flit traversal within the 3D crossbar. An improved True 3D NoC router architecture (particularly in its crossbar structure), called DimDe, has been proposed and exhibits an improved energy-delay product [42].

Despite this encouraging result, there are some important drawbacks. Adding a large number of vertical links to a 3D crossbar to increase NoC connectivity increases path diversity, that is, it creates multiple possible paths between source and destination pairs, and this dramatically increases the complexity and power consumption of the central arbiter.


Figure 1. NoC architectures. (a) Three-dimensional mesh. (b) NoC-Bus Hybrid mesh. (c) Ciliated 3D mesh. (d) NoC routers with true 3D crossbars.

v. TREE-BASED 3D NOCS

Butterfly fat tree (BFT) [44], [45] and the generic fat tree, or SPIN [46], are the two types of tree-based interconnection networks that have been considered for NoC applications. According to [11], considerable enhancements can be achieved when these networks are instantiated in a 3D IC environment. Unlike the work on mesh-based NoCs, no new topologies were proposed for tree-based systems. Instead, the authors present the performance benefits achievable by instantiating already-existing tree-based NoC topologies in a 3D environment.

It can be concluded from their reported results that when the 2D BFT network is mapped onto a multi-layer 3D SoC, wire routing becomes simpler, and the longest inter-switch wire length is reduced by at least a factor of two in comparison with the one-layer 2D implementation. This leads to reduced energy dissipation as well as smaller area overhead. They argue that the fat tree topology will have the same advantages as the BFT when mapped onto a 3D IC.

vi. XNOTS 3D NOC

To make the best use of the short delay and high density of inter-wafer links, Matsutani et al. [47] propose Xbar-connected Network-on-Tiers (XNoTs), which consist of multiple network layers tightly connected via crossbar switches. XNoTs-based architectures have crossbar switches that connect different layers and their cores in such a way that the 2D topology on every layer can be independently customized to meet the cost-performance requirements, as long as network connectivity is guaranteed at least through the bottom layer. The architecture is not power-efficient, because it requires large vertical switches.

vii. MIRA 3D NOC

Park et al. [48] propose a Multi-layered on-chip Interconnect Router Architecture (MIRA), which is based on implementing a 2D mesh chip-multiprocessor in three dimensions. Unlike the 3D routers explained above, MIRA is a NoC router architecture that is itself stacked into multiple layers and optimized to reduce the overall area requirements and power consumption. The major drawback of the architecture is that it assumes the processor cores are designed in 3D. This makes it difficult to reuse existing, highly optimized 2D processor core designs.

viii. DE-BRUIJN GRAPH-BASED 3D NOC

Architectures like the typical 3D mesh, the ciliated 3D mesh or tree-based 3D NoCs have a common disadvantage: large network latency because of the large network diameter of the topology. In [49], Chen et al. propose a novel 3D NoC architecture based on the De-Bruijn graph. The De-Bruijn topology is an efficient topology for parallel processing purposes. The advantages of the De-Bruijn graph network topology are small diameter, high connectivity and high reliability [50]. The degree of a NoC based on the De-Bruijn graph does not change as the size of the network increases.

This architecture benefits from a simple routing algorithm. The De-Bruijn architecture provides better throughput than the 3D mesh NoC because of its shorter diameter. On the other hand, the solution is not power-efficient compared to the 3D mesh NoC, because a shorter route cannot be achieved in most cases. The proposed topology is shown in Figure 2(a).
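The constant degree, small diameter and simple routing attributed to De-Bruijn networks can be illustrated on the textbook binary De-Bruijn graph. The sketch below is not the 3D construction of Chen et al. [49], whose details are not given here. It uses the generic directed graph on 2^n nodes, and all function names are hypothetical.

```python
from collections import deque


def de_bruijn_successors(node, n_bits):
    """Out-neighbours in the binary De-Bruijn graph on 2**n_bits nodes.

    The label is shifted left by one bit and a 0 or a 1 is appended,
    so every node has exactly two outgoing links regardless of size.
    """
    mask = (1 << n_bits) - 1
    shifted = (node << 1) & mask
    return [shifted, shifted | 1]


def diameter(n_bits):
    """Longest shortest-path length over all node pairs, found by BFS."""
    worst = 0
    for src in range(1 << n_bits):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in de_bruijn_successors(u, n_bits):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


def shift_route(src, dst, n_bits):
    """Simple route that shifts in the bits of dst, most significant first.

    Always takes n_bits hops, so it is easy to compute but not always
    the shortest possible path.
    """
    mask = (1 << n_bits) - 1
    node, path = src, [src]
    for i in reversed(range(n_bits)):
        node = ((node << 1) & mask) | ((dst >> i) & 1)
        path.append(node)
    return path


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{1 << n} nodes: out-degree 2, diameter {diameter(n)}")
    print(shift_route(0b000, 0b101, 3))
```

In this generic graph the diameter grows only logarithmically with the node count while the out-degree stays at two, which is the property that motivates the topology.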

ix. SERIALIZED VERTICAL CHANNEL 3D NOC

In 3D ICs, vertical TSVs take up significant chip area because of their typically spread-out distribution. Pasricha [51] proposes serializing the vertical TSV interconnects to reduce their area footprint and avoid interconnect routing congestion. Such serialization can lead to a better thermal TSV distribution, resulting in lower peak temperatures. The extra space made available on each layer by serialization can be used for more efficient core layout and routing across multiple layers, as well as for more efficient thermal TSV insertion for temperature management. These area savings and other benefits come at the cost of a nominal power and performance overhead.

The author presents the impact on performance of varying degrees of serialization. Averaged over various applications, the performance degradation is about 1.7% for 4:1 serialization (64 → 16 wires) but reaches around 16.1% for 64:1 serialization (64 → 1 wire). Performance degradation also depends on the frequency of vertical transfers. Lower degrees of serialization, such as 4:1 or 2:1, are more practical because they incur smaller performance degradation and lower power overhead while still reducing the footprint area significantly.
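The wire counts behind these serialization ratios are simple to tabulate. The sketch below is a back-of-the-envelope model, not Pasricha's evaluation. It assumes one bit per TSV per cycle, the same clock before and after serialization, and no extra control wires for the serializer; `serialized_link` is a hypothetical name. The two degradation figures are the ones quoted above.

```python
# Average performance degradation quoted in the text for two ratios.
REPORTED_DEGRADATION_PERCENT = {4: 1.7, 64: 16.1}


def serialized_link(flit_width_bits, ratio):
    """Return (signal TSVs, cycles per flit) for a ratio:1 serialized link.

    Assumes each TSV carries one bit per cycle (a simplification).
    """
    if ratio < 1 or flit_width_bits % ratio != 0:
        raise ValueError("ratio must be a positive divisor of the flit width")
    return flit_width_bits // ratio, ratio


if __name__ == "__main__":
    for ratio in (1, 2, 4, 8, 16, 32, 64):
        tsvs, cycles = serialized_link(64, ratio)
        saved = 100 * (1 - tsvs / 64)
        reported = REPORTED_DEGRADATION_PERCENT.get(ratio)
        note = f", reported slowdown {reported}%" if reported else ""
        print(f"{ratio:>2}:1 -> {tsvs:>2} TSVs, {cycles:>2} cycles/flit, "
              f"{saved:.0f}% fewer TSVs{note}")
```

Most of the TSV saving is already obtained at low ratios (4:1 removes three quarters of the wires), which is consistent with the observation that 2:1 and 4:1 are the practical operating points.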


x. HONEYCOMB 3D NOC

In topology design, there is always a tradeoff between degree and diameter. The degree reflects the hardware cost. Mesh and torus are the mainstream topologies because of their high regularity, symmetry and scalability, but they come with extra hardware cost (degree). Yin et al. [52] propose the honeycomb interconnect topology as an alternative for NoC-based designs. The honeycomb topology reduces the network cost significantly while maintaining the positive characteristics of typical mesh and torus topologies.

Compared with regular mesh and torus topologies in two dimensions, the honeycomb mesh and torus provide an approximately 40% reduction in network cost. The smaller network degree of the honeycomb topology makes the router architecture simpler, more reliable and more power-efficient. The 3D honeycomb mesh topology with network degree 5 is shown in Figure 2(b). To further reduce the network cost, vertical links can be removed by systematically bi-partitioning the routers into odd and even groups. The authors also present a deadlock-free routing algorithm for the 3D honeycomb topology.
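The roughly 40% figure can be reproduced with a short calculation if network cost is taken to be the product of degree and diameter. That definition, and the asymptotic diameters used below (about 2·sqrt(n) for a square mesh and about 1.63·sqrt(n) for a honeycomb mesh with n nodes), come from the general honeycomb-network literature rather than from this survey, so the sketch should be read as an assumption-laden illustration with hypothetical names.

```python
import math


def network_cost(degree, diameter):
    """Network cost defined as degree x diameter (assumed definition)."""
    return degree * diameter


def square_mesh_cost(n):
    """2D mesh: degree 4, diameter approximately 2*sqrt(n)."""
    return network_cost(4, 2 * math.sqrt(n))


def honeycomb_mesh_cost(n):
    """2D honeycomb mesh: degree 3, diameter approximately 1.63*sqrt(n)."""
    return network_cost(3, 1.63 * math.sqrt(n))


def cost_reduction(n):
    """Fractional cost saving of the honeycomb mesh over the square mesh."""
    return 1 - honeycomb_mesh_cost(n) / square_mesh_cost(n)


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(f"n={n}: cost reduction {100 * cost_reduction(n):.1f}%")
```

Because both diameters scale with sqrt(n) in this approximation, the saving is independent of the network size.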

xi. LOW-RADIX 3D NOC

The key problem faced by current 3D stacking technology is that only vertical inter-layer links are allowed. As a result, direct connections between arbitrary nodes located on different layers are not possible. In 3D NoC architectures, the system design is highly constrained by the complexity and power of routers and links. Thus, low-radix routers are preferred due to their lower power consumption and better heat dissipation. This, however, increases latency because of the high hop counts in network paths.

Xu et al. [53] present an efficient network topology for 3D NoCs that uses long-range links. As shown in Figure 2(c), long-range links give the topology a low diameter while requiring only low-radix routers. The long-range links show a significant reduction in latency even at higher operating frequencies and with pipelined wires. The increase in power consumption is sub-linear in the increase in link length. The authors report 1 GHz as the optimal operating frequency for 3D NoCs, because a higher clock frequency in a 3D chip raises heat dissipation concerns.

xii. LAYER-MULTIPLEXED 3D NOC

As discussed above, architectures like the 3D NoC-Bus hybrid can reduce the network cost, but they have other performance bottlenecks, such as bus bandwidth limitations. Another issue is layer load balancing. If one layer is congested while other layers are only lightly loaded, the system is neither power-efficient nor performance-efficient.

Ramanujam et al. [54] present an efficient Layer-Multiplexed (LM) 3D architecture for vertical communication that takes load balancing into account. More precisely, the layer-multiplexed architecture replaces the one-layer-per-hop routing of a conventional 3D mesh with simpler vertical de-multiplexing and multiplexing stages. The proposed architecture has two major drawbacks. First, packets traverse a two-stage crossbar, which makes it less power- and area-efficient. Second, each packet uses two hops for vertical communication, which makes the architecture less throughput-efficient. On the other hand, the reduced router degree and the layer load balancing compensate for the power degradation.

xiii. BBVC-3D-NOC

In 3D NoCs, as the number of cores in each layer increases to support growing application complexity, the amount of communication between layers is also expected to grow, and consequently the number of interconnect TSVs will rise. Since each TSV requires a pad for bonding to a wafer layer, the area footprint of TSVs in each layer is no longer negligible. As discussed before, serialization of vertical TSV interconnects has been proposed as a way to reduce the interconnect TSV footprint. However, it degrades performance due to serialization overhead and low bandwidth utilization.

In [55], Rahmani et al. explore a mechanism to reduce the TSV area footprint, thereby improving 3D IC cost, routability, thermal efficiency, and power consumption. Specifically, they propose a novel technique that replaces the pair of unidirectional vertical channels between layers with a single bidirectional channel that is dynamically self-reconfigurable for use in either the outgoing or the incoming direction. To compensate for the bandwidth degradation, they exploit the low-latency nature of vertical TSVs by establishing high-speed inter-layer communication using mixed-clock FIFOs. Figure 2(d) shows a schematic representation of their proposed Bidirectional Bisynchronous Vertical Channels (BBVC)-based NoC. The main idea of the proposed 3D NoC system is to use a bidirectional channel for inter-layer communication that operates at a higher frequency than intra-layer communication (f2 > f1) and can dynamically change the channel direction between routers in neighboring layers based on the real-time bandwidth need. The BBVC-3D-NoC can cope with the major inherent drawbacks of the 3D symmetric NoC architecture by exploiting the beneficial attribute of a negligible inter-wafer distance in 3D chips. In addition, BBVCs are only responsible for inter-layer communication. In other words, the proposed inter-layer communication scheme is independent of the intra-layer topology. The main disadvantage of the BBVC-3D-NoC is that it does not support bus-based vertical communication, so communication in the vertical dimension is not single-hop.
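The trade-off behind the bidirectional channel can be made concrete with a peak-bandwidth comparison and a toy direction arbiter. The sketch below is not the authors' circuit. The arbitration rule, the function names and the example numbers are hypothetical, and the bandwidth model ignores FIFO synchronization and direction-switching overhead.

```python
def vertical_peak_bandwidth(width_bits, f1_hz, f2_hz):
    """Peak vertical bandwidth (bits/s) of the two schemes between two layers.

    baseline: two unidirectional channels clocked at the intra-layer rate f1.
    bbvc:     one bidirectional channel clocked at the faster rate f2,
              shared by both directions, using half the TSVs.
    """
    baseline = 2 * width_bits * f1_hz
    bbvc = width_bits * f2_hz
    return baseline, bbvc


def pick_direction(pending_up, pending_down, last):
    """Hypothetical per-cycle arbitration rule for the shared channel.

    Grants the direction with more waiting flits; on a tie it alternates
    with the previously granted direction so neither side is favoured.
    Returns "up", "down", or None when nothing is waiting.
    """
    if pending_up == 0 and pending_down == 0:
        return None
    if pending_up > pending_down:
        return "up"
    if pending_down > pending_up:
        return "down"
    return "down" if last == "up" else "up"


if __name__ == "__main__":
    base, bbvc = vertical_peak_bandwidth(64, 1_000_000_000, 2_000_000_000)
    print(f"two unidirectional channels: {base / 1e9:.0f} Gbit/s")
    print(f"one bidirectional channel:   {bbvc / 1e9:.0f} Gbit/s")
    print(pick_direction(3, 1, last="down"))
```

In this simplified model the single channel matches the aggregate peak bandwidth of the unidirectional pair once f2 reaches twice f1, and when traffic is mostly one-way it can devote its whole bandwidth to that direction.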

xiv. SPECIAL PURPOSE 3D NOCS

Designing 3D SoCs that satisfy the application performance requirements with minimum power consumption while also satisfying the 3D technology constraints is a big challenge. A synthesis-based, power- and performance-efficient design approach for 3D NoCs can deal with such issues.

Seiculescu et al. [56] present SunFloor 3D, a tool for NoC topology synthesis for 3D ICs. The tool also handles path computation and the assignment and placement of network elements in the 3D layers. Separate algorithms for core-to-switch connectivity and for path computation are presented with the corresponding constraints. A comprehensive comparison between 2D and 3D NoCs is also presented, which shows that 3D integration can significantly reduce latency and power consumption compared to 2D interconnects. The topologies produced by the SunFloor tool show significant power and latency savings compared to standard topologies. A similar approach has been adopted in [57].

IV. SUMMARY

Recently, Networks-on-Chip architectures have gained popularity as a way to address the interconnect delay problem in on-chip multi-core systems designed in deep sub-micron technology. However, almost all prior studies have focused on 2D NoC designs. Since three-dimensional (3D) integration has emerged to mitigate the interconnect delay and power problems, exploring the NoC design space in 3D provides ample opportunities to design high-performance and energy-efficient NoC architectures. In this survey, we have given an overview of the existing architectures for 3D NoCs and highlighted their impact on network characteristics. We first stated the motivation for 3D NoCs and introduced the basic concepts. We then investigated various architectural alternatives for designing a high-performance and energy-efficient 3D NoC system. We have shown that, besides reducing the footprint of a fabricated design, most NoC architectures achieve better power consumption and performance characteristics when instantiated in a 3D IC environment than in traditional 2D implementations. The pros and cons of the analyzed structures are concisely summarized in Table 1.


V. REFERENCES

[1] L. Benini and G. D. Micheli, “Networks on chips: a new SOC paradigm,” IEEE Computer, Vol. 35, No. 1, 2002, pp. 70–78.

[2] W. J. Dally and B. Towles, “Route Packets, Not Wires: On-Chip Interconnection Networks,” in Proc. of Design Automation Conference, 2001, pp. 684–689.

[3] J. Kim et al., “A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks,” in Proc. of 33rd International Symposium on Computer Architecture, 2006, pp. 4–15.

[4] R. Mullins et al., “Low-Latency Virtual-Channel Routers for On-Chip Networks,” in Proc. of 31st International Symposium on Computer Architecture, 2004, pp. 188–198.

[5] S. Heo and K. Asanovic, “Replacing global wires with an on-chip network: a power analysis,” in Proc. of the International Symposium on Low power electronics and design, 2005, pp. 369–374.

[6] R. Kumar et al., “Interconnections in multi-core architectures: Understanding mechanisms, overheads and scaling,” in Proc. of the 32nd International Symposium on Computer Architecture, 2005, pp. 408–419.

[7] L. Shang et al., “Thermal Modeling, Characterization and Management of On-Chip Networks,” in Proc. of the 37th International Symposium on Microarchitecture, 2004, pp. 67–78.

[8] R. Marculescu, “Networks-On-Chip: The Quest for On-Chip Fault-Tolerant Communication,” in Proc. of the IEEE Computer Society Annual Symposium on VLSI, 2003, pp. 8–12.

[9] Arteris, http://www.arteris.com/.

[10] STMicroelectronics Spidergon, http://www.st.com/stonline/.

[11] B. S. Feero and P. P. Pande, “Networks-on-Chip in a Three-Dimensional Environment: A Performance Evaluation,” IEEE Transactions on Computers, 2009, pp. 32–45.

[12] A. W. Topol et al., “Three-Dimensional Integrated Circuits,” IBM J. Research and Development, Vol. 50, No. 4/5, 2006, pp. 491.

[13] W. R. Davis et al., “Demystifying 3D ICS: The Pros and Cons of Going Vertical,” IEEE Design and Test of Computers, Vol. 22, No. 6, 2005, pp. 498–510.

[14] Y. Deng et al., “2.5D System Integration: A Design Driven System Implementation Schema,” in Proc. of Asia and South Pacific Design Automation Conference, 2004, pp. 450–455.

[15] M. Ieong et al., “Three Dimensional CMOS Devices and Integrated Circuits,” in Proc. IEEE Custom Integrated Circuits Conference, 2003, pp. 207–213.

[16] V. F. Pavlidis and E. G. Friedman, “3-D Topologies for Networks-on-Chip,” IEEE Transactions on VLSI Systems, Vol. 15, No. 10, 2007, pp. 1081–1090.

[17] S. M. Jung et al., “High area efficient and cost effective double stacked S3 (stacked single-crystal Si) peripheral CMOS SSTFT and SRAM cell technology for 512M bit density SRAM,” IEEE International Electron Devices Meeting Technical Digest, 2004, pp. 265–268.

[18] S. M. Jung et al., “Three dimensionally stacked NAND flash memory technology using stacking single crystal Si layers on ILD and TANOS structure for beyond 30nm node,” IEEE International Electron Devices Meeting Technical Digest, 2006, pp. 1–4.

[19] F. Li et al., “Evaluation of SiO2 antifuse in a 3D-OTP memory,” IEEE Transactions on Device and Materials Reliability, Vol. 4, No. 3, 2004, pp. 416–421.

[20] T. W. Chen et al., “Thermal modeling and device noise properties of three-dimensional-SOI technology,” IEEE Transactions on Electron Devices, Vol. 56, No. 4, 2009, pp. 656–664.

[21] K. Bernstein et al., “Interconnects in the Third Dimension: Design Challenges for 3D ICs,” in Proc. of Design Automation Conference, 2007, pp. 562–567.

[22] R. S. Patti, “Three-Dimensional Integrated Circuits and the Future of System-on-Chip Designs,” in Proc. of IEEE, Vol. 94, No. 6, 2006, pp. 1214–1224.

[23] G. Philip et al., Handbook of 3D Integration. Wiley-VCH, 2008.

[24] B. Black et al., “3D processing technology and its impact on iA32 microprocessors,” in Proc. of the International Conference on Computer Design, 2004, pp. 316–318.

[25] T. Kgil et al., “PICOSERVER: Using 3D Stacking Technology to Enable a Compact Energy Efficient Chip Multiprocessor,” in Proc. Conf. Architectural Support for Programming Languages and Operating Systems, ACM Press, 2006, pp. 117–128.

[26] Y.-F. Tsai et al., “Three-dimensional cache design exploration using 3DCacti,” in Proc. of the International Conference on Computer Design, 2005, pp. 519–524.

[27] K. Puttaswamy and G. H. Loh, “Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors,” in Proc. of the 13th High Performance Computer Architecture, 2007, pp. 193–204.

[28] C. Ababei et al., “Placement and Routing in 3D Integrated Circuits,” IEEE Design & Test, Vol. 22, No. 6, 2005, pp. 520–531.

[29] A. Jantsch and H. Tenhunen. Networks on Chip. Kluwer Academic Publishers, 2003.

[30] G. De Micheli and L. Benini. Networks on Chips. Morgan Kaufmann, San Francisco, CA, 2006.

[31] D. Shamik et al., “Technology, performance, and computer-aided design of three-dimensional integrated circuits,” in Proc. of the International Symposium on Physical design, 2004, pp. 108–115.

[32] P. Morrow et al., “Wafer-Level 3D Interconnects Via Cu Bonding,” in Proc. of the 21st Advanced metallization Conference, 2004, pp. 125–130.

[33] C. C. Liu et al., “Bridging the processor-memory performance gap with 3D IC technology,” Design & Test of Computers, Vol. 22, No. 6, 2005, pp. 556–564.

[34] J. W. Joyner et al., “A stochastic global netlength distribution for a three-dimensional system-on-a-chip (3D-SoC),” in Proc. of the International ASIC/SOC Conference, 2001, pp. 147–151.

[35] N. Madan and R. Balasubramonian, “Leveraging 3D Technology for Improved Reliability,” in Proc. of International Symposium on Microarchitecture, 2007, pp. 223–235.

[36] J. Cong and Y. Zhang, “Thermal via planning for 3-D ICs,” in Proc. of the 2005 IEEE/ACM International Conference on Computer-aided design, 2005, pp. 745–752.

[37] B. Goplen and S. Sapatnekar, “Efficient thermal placement of standard cells in 3D ICs using a force directed approach,” in Proc. of International Conference on Computer Aided Design, 2003, pp. 86–89.

[38] B. Dang et al., “Wafer-level microfluidic cooling interconnects for GSI,” in Proc. of the IEEE International Interconnect Technology Conference, 2005, pp. 180–182.

[39] W.-L. Hung et al., “Interconnect and Thermal-aware Floorplanning for 3D Microprocessors,” in Proc. of International Symposium on Quality Electronic Design, 2006, pp. 98–104.

[40] B. Black et al., “Die stacking (3d) microarchitecture,” in Proc. of the International Symposium on Microarchitecture, 2006, pp. 469–479.

[41] L. P. Carloni et al., “Networks-on-Chip in Emerging Interconnect Paradigms: Advantages and Challenges,” in Proc. of the IEEE International Symposium on Networks-On-Chip, 2009, pp. 10–13.

[42] J. Kim et al., “A novel dimensionally-decomposed router for on-chip communication in 3D architectures,” in Proc. of International Symposium on Computer Architecture, 2007, pp. 138–149.

[43] F. Li et al., “Design and management of 3D chip multiprocessors using network-in-memory,” in Proc. of Int. Symp. on Computer Architecture, 2006, pp. 130–141.

[44] R. I. Greenberg and L. Guan, “An Improved Analytical Model for Wormhole Routed Networks with Application to Butterfly Fat Trees,” in Proc. of International Conference on Parallel Processing, 1997, pp. 44–48.

[45] C. Grecu et al., “A Scalable Communication-Centric SoC Interconnect Architecture,” in Proc. of International Symposium on Quality Electronic Design, 2004, pp. 343–348.

[46] P. Guerrier and A. Greiner, “A Generic Architecture for On-Chip Packet-Switched Interconnections,” in Proc. of Design, Automation and Test in Europe Conference, 2000, pp. 250–256.

[47] H. Matsutani et al., “Tightly-coupled multi-layer topologies for 3-D NOCs,” in Proc. of International Conference on Parallel Processing, 2007, pp. 75–84.

[48] D. Park et al., “MIRA: A multi-layered on-chip interconnect router architecture,” in Proc. of International Symposium on Computer Architecture, 2008, pp. 251–261.

[49] Y. Chen et al., “De Bruijn graph based 3D Network on Chip architecture design,” in Proc. of International Conference on Communications, Circuits and Systems, 2009, pp. 986–990.

[50] M. Hosseinabady et al., “Reliable network-on-chip based on generalized de Bruijn graph,” in Proc. of International High Level Design Validation and Test Workshop, 2007, pp. 3–10.

[51] S. Pasricha, “Exploring serial vertical interconnects for 3D ICs,” in Proc. of Design Automation Conference, 2009, pp. 581–586.

[52] A. W. Yin et al., “Explorations of Honeycomb Topologies for Network-on-Chip,” in Proc. of International Conference on Network and Parallel Computing, 2009, pp. 73–79.

[53] Y. Xu et al., “A low-radix and low-diameter 3D interconnection network design,” in Proc. of International Conference on High Performance Computer Architecture, 2009, pp. 30–42.

[54] R. S. Ramanujam and B. Lin, “A Layer-Multiplexed 3D On-Chip Network Architecture,” IEEE Embedded Systems Letters, Vol. 1, No. 2, 2009, pp. 50–55.

[55] A.-M. Rahmani et al., “An Efficient 3D NoC Architecture Using Bidirectional Bisynchronous Vertical Channels,” in Proc. of 2010 IEEE Computer Society Annual Symposium on VLSI, 2010, pp. 452–453.

[56] C. Seiculescu et al., “SunFloor 3D: A tool for Networks on Chip topology synthesis for 3D systems on chips,” in Proc. of Design, Automation & Test in Europe Conference & Exhibition, 2009, pp. 9–14.

[57] S. Murali et al., “Synthesis of networks on chips for 3D systems on chips,” in Proc. of Asia and South Pacific Design Automation Conference, 2009, pp. 242–247.