  • Analyzing Clock Trees

    Jeff Shabel

    QUALCOMM, Inc.

    [email protected]

    ABSTRACT

    Clock tree power continues to be a major contributor to dynamic and, to a lesser degree, static chip power. It is imperative for low-power designs to reduce clock tree power as much as possible. The introduction of Power Compiler makes it possible to drastically cut dynamic clock tree power. This paper shows how PrimeTime 2004.12 was used to obtain fairly accurate clock tree power estimates on a 130nm chip. On this chip, Power Compiler dynamic clock gating was applied to several blocks. This paper describes how much power was consumed by each component of the clock tree, including wire, pin, clock buffer, and register components. By analyzing how these different components contribute to the overall clock tree power, it was possible to find ways to improve the library to achieve lower-power designs. In addition, this analysis shows how much of the clock tree power was consumed at the leaf of the tree. In situations where the leaf of the tree consumes more power, Power Compiler can be used to achieve significant savings in dynamic clock tree power. This paper explains the potential savings achieved on a real chip by using Power Compiler and compares this information to real silicon data. The paper also compares two different chips, one that used Power Compiler and one that did not, and describes how power savings for two blocks were achieved with Power Compiler.

    In addition to analyzing clock tree power, this paper explains how to take advantage of the new PrimeTime 2004.12 feature, read_parasitics_load_locations, to assist with visualizing clock trees. The paper describes how Tcl scripts can be used to plot clock trees, critical paths, and clock tree power relative to a given floorplan. These scripts can provide the designer with a quick and intuitive way to analyze clock trees and to find any potential issues with them.

  • SNUG Boston 2005 Analyzing Clock Trees 2

    Table of Contents

    1.0 Introduction
    2.0 How is clock tree power measured now?
        2.1 Static measurements
            2.1.1 PrimePower
            2.1.2 With real silicon
        2.2 Dynamic measurements
            2.2.1 PrimePower
            2.2.2 With real silicon
    3.0 Using PrimeTime to calculate clock tree power
        3.1 Overview
        3.2 Details of the PrimeTime Tcl script
    4.0 Static analysis results
        4.1 Ungated clock tree power
        4.2 Gated clock tree power
    5.0 Correlation to silicon
        5.1 Setup and method
        5.2 Ungated clocks
        5.3 Gated clocks
        5.4 Dynamic clock savings
    6.0 Power plotting
        6.1 Goals and setup
        6.2 Sample results
    7.0 Clock tree plotting
        7.1 Goals and setup
        7.2 Sample results
    8.0 Conclusion
    9.0 Acknowledgements
    10.0 References
    11.0 Appendix
        11.1 Perl script to preprocess .lib file
        11.2 Clock Analyzer Tcl script


    Table of Figures

    Figure 3-1 Clock tree power components
    Figure 3-2 Clock Buffer .lib extraction example
    Figure 3-3 Register .lib extraction example
    Figure 3-4 Tcl script defining .lib power numbers
    Figure 6-1 Power plot example
    Figure 7-1 Clock tree plotting example


    1.0 Introduction

    Clock tree power is a significant contributor to overall dynamic chip power. Industry-wide averages indicate that 40 to 50% of dynamic chip power comes from the clock tree [1]. Analyzing the components that make up clock tree power is important. The ultimate goal is to identify the most important components of clock tree power so that designers can concentrate on those areas to reduce clock tree power on future chips. Another goal is to evaluate the effect of using Power Compiler's clock gating feature on a design. If most of the clock tree power is at the leaf of the tree, then Power Compiler clock gating will have a major impact on reducing clock tree power.

    The clock tree analysis described in this paper was done entirely within PrimeTime. While PrimePower can do some of this analysis, it does not have the flexibility to analyze different aspects of clock tree power that simple Tcl scripts inside PrimeTime provide. This paper also describes how to use a new PrimeTime feature, read_parasitics_load_locations, which helps in viewing critical paths and clock trees.

    2.0 How is clock tree power measured now?

    Existing Synopsys tools provide various methods to measure clock tree power. This section discusses the strengths and shortcomings of each method and the tools that are in use today.

    2.1 Static measurements

    2.1.1 PrimePower

    PrimePower can calculate clock tree power and break down the components into several categories. However, it has several shortcomings for reporting and debugging. First, PrimePower requires a separate license from PrimeTime. While many companies have PrimeTime licenses, not all companies have a PrimePower license, nor do they have people sufficiently experienced with the tool to use it effectively. Second, PrimePower does not consider the internal register power that is consumed when only the clock pin is toggling. Third, PrimePower does not have the flexibility to quickly analyze clock trees starting and ending at specific points chosen by the user. Fourth, PrimePower cannot extract the additional useful clock tree power statistics that are shown later in this paper.

    2.1.2 With real silicon

    To measure clock tree power with real silicon, the design must provide an easy way to turn separate clock trees on and off at their source while holding the design in some sort of reset state. If this mechanism is provided, the power measurement simply involves measuring the current before and after the clock is turned on. The difference between the two measurements is the amount of clock tree power consumed. Because it is difficult to create an effective measurement setup with silicon, the measured clock tree power should be sanity-checked against static predictions.


    It is nearly impossible to bound the best-case and worst-case clock tree power numbers resulting from Power Compiler clock gating cells. Even if a design can be held in a reset state while measuring clock tree power, it is not known what percentage of the Power Compiler clock gating cells will be in a gating state and what percentage will be in a non-gating state. Ideally, the clock tree power would be bounded by a maximum and a minimum value that depend on the gating state of the clock gating cells. This bounding currently cannot be done using real silicon.

    2.2 Dynamic measurements

    2.2.1 PrimePower

    PrimePower requires a SAIF file based on a simulation of a real-life scenario. The SAIF file would need to encompass all clock trees. The other option is to create multiple SAIF files, one per clock regime, to help isolate certain clocks. Generating SAIF files that reflect real functionality can be difficult.

    2.2.2 With real silicon

    During real chip operation, it can be difficult to isolate clock tree power from combinational logic switching power. Even if measurements can be taken, they must be correlated to some other data (from PrimePower or other estimated analysis, for example) as a sanity check.

    3.0 Using PrimeTime to calculate clock tree power

    3.1 Overview

    Figure 3-1 Clock tree power components

    Clock tree power is calculated by taking into account the components shown in Figure 3-1. The resulting general formula for clock tree power consumption is:

        Clock Tree Power = Power(Cint_buffers) + Power(Cint_leaf_cells) + Power(Cwire) + Power(Cpin)

    PrimeTime can provide the wire and pin capacitances required to calculate total power. PrimeTime will not provide the internal switching power of the buffers and leaf cells that comes from the clock lines toggling. This information needs to be extracted from the .lib file for the standard cell library. The internal_power tables need to be parsed for buffers and leaf cells. This information is then fed into PrimeTime to complete the calculation.

    The internal switching power of a clock tree buffer is a function of two values: input transition time and output load. It is assumed that the input transition times and output loads of clock tree buffers are fairly tight (and consistent) across the clock tree. If this is the case, the internal switching power of a clock tree buffer does not change much within the range of acceptable transition times and output loads for most clock tree synthesis (CTS) settings. It is therefore reasonable to choose an average power value from the internal_power table inside the .lib file of the standard cell library. An example is shown in Figure 3-2. Note that the values in red were chosen by the preprocessing Perl script to feed into PrimeTime. These values were chosen because they are near the center of the power table.

        power_lut_template (clock_buffer1_energy_template_0) {
          variable_1 : input_transition_time ;
          variable_2 : total_output_net_capacitance ;
          index_1 ( "0.1,0.25,0.3,0.45,0.5,2.0" );
          index_2 ( "0,5,25,50,90,340,2000" );
        }
        cell (clock_buffer1) {
          [snip]
          pin (z) {
            [snip]
            internal_power () {
              related_pin : "a" ;
              fall_power (clock_buffer1_energy_template_0) {
                values ( "77,77,78,79,79,80,80",\
                         "76,76,77,78,78,83,80",\
                         "75,75,76,77,77,79,79",\
                         "76,76,76,76,77,80,80",\
                         "78,77,77,78,78,79,81",\
                         "85,85,84,83,84,85,86" );
              }
              rise_power (clock_buffer1_energy_template_0) {
                values ( "80,80,81,81,82,81,82",\
                         "79,79,80,80,81,78,79",\
                         "77,77,78,78,78,78,79",\
                         "79,79,78,77,76,74,75",\
                         "81,82,84,86,88,71,71",\
                         "87,87,86,86,86,88,90" );
              }
            }
          }
        }

    Figure 3-2 Clock Buffer .lib extraction example

    The same principle applies to registers. Internal register power depends only on input transition time. Because transition times are fairly sharp after CTS, it is reasonable to choose an average (or best guess) value from the .lib file of the standard cell library. An example is shown in Figure 3-3. Note that the values in red were chosen by the preprocessing Perl script to feed into PrimeTime. These values were chosen because they are near the center of the power table.
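The table-center selection described above can be illustrated with a short Python sketch. It is not the author's Perl preprocessing script: the function names `parse_values` and `center_value` are hypothetical, and taking the integer midpoint of each table axis is an assumption, since the exact entries the original script selected (the values shown in red) cannot be recovered from the text. The sketch also sums the rise and fall entries, which is what this section later describes the preprocessing script as doing. The table data is copied from Figures 3-2 and 3-3.

```python
def parse_values(rows):
    """Turn .lib-style value rows such as "77,77,78" into lists of floats."""
    return [[float(v) for v in row.split(",")] for row in rows]


def center_value(table):
    """Return the entry at the integer midpoint of a 1-D or 2-D table.

    A 1-D table (register clock pin) is represented as a single row.
    The midpoint rule is an assumption made for this sketch.
    """
    row = table[len(table) // 2]
    return row[len(row) // 2]


# Clock buffer tables from Figure 3-2 (rows: input transition, columns: load).
BUFFER_FALL = parse_values([
    "77,77,78,79,79,80,80", "76,76,77,78,78,83,80", "75,75,76,77,77,79,79",
    "76,76,76,76,77,80,80", "78,77,77,78,78,79,81", "85,85,84,83,84,85,86",
])
BUFFER_RISE = parse_values([
    "80,80,81,81,82,81,82", "79,79,80,80,81,78,79", "77,77,78,78,78,78,79",
    "79,79,78,77,76,74,75", "81,82,84,86,88,71,71", "87,87,86,86,86,88,90",
])

# Register clock-pin tables from Figure 3-3 (indexed by input transition only).
REG_FALL = parse_values(["26,26,26,26,26,27"])
REG_RISE = parse_values(["25,25,25,25,25,26"])

# One clock cycle contains one rising and one falling edge, so sum both.
buffer_total = center_value(BUFFER_RISE) + center_value(BUFFER_FALL)
register_total = center_value(REG_RISE) + center_value(REG_FALL)

print("clock_buffer1 total_power:", buffer_total)
print("reg1 total_power:", register_total)
```

Because the tables are nearly flat around their centers, a different midpoint convention would change the selected values by only a unit or two, which is the property the text relies on.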


        power_lut_template (reg1_energy_template_0) {
          variable_1 : input_transition_time ;
          index_1 ( "0.1,0.2,0.3,0.4,0.8,1.5" );
        }
        cell (reg1) {
          [snip]
          pin (clk) {
            direction : input ;
            capacitance : 5.0;
            clock : true ;
            internal_power () {
              fall_power (reg1_energy_template_0) {
                values ( "26,26,26,26,26,27" );
              }
              rise_power (reg1_energy_template_0) {
                values ( "25,25,25,25,25,26" );
              }
            }
          }
          [snip]
        }

    Figure 3-3 Register .lib extraction example

    At QUALCOMM, we use our own standard cell library. However, the same principle can be applied to the TSMC standard cell library as well. A Perl script can be used to preprocess the .lib files and write a Tcl script to read into PrimeTime. This Tcl script defines a new user attribute for each clock buffer and register in the standard cell library to store these values. Note that the Perl script must sum the rise and fall power and supply the summed value to PrimeTime. The summed value represents the power consumed during one clock cycle. A sample portion of the resulting Tcl script is shown in Figure 3-4.

        define_user_attribute -type float -classes lib_cell total_power
        set_user_attribute [get_lib_cells std_cell_library/inv_a] total_power 38
        set_user_attribute [get_lib_cells std_cell_library/inv_b] total_power 40
        set_user_attribute [get_lib_cells std_cell_library/inv_c] total_power 44
        set_user_attribute [get_lib_cells std_cell_library/inv_d] total_power 50
        set_user_attribute [get_lib_cells std_cell_library/inv_e] total_power 53
        set_user_attribute [get_lib_cells std_cell_library/inv_f] total_power 60

    Figure 3-4 Tcl script defining .lib power numbers

    The Tcl script provides PrimeTime with the additional information necessary to calculate clock tree current consumption. A Tcl script can be written to traverse the clock tree, computing clock tree power as it goes, until it reaches a leaf cell. A leaf cell is typically a register but can also be a memory element, custom block, or random logic gate. While traversing the tree, the Tcl script can save various statistics for use after the traversal completes.

    First, the script can optionally stop at Power Compiler clock gating cells (CGCs). If the Tcl script is run twice on the same clock, once stopping at CGCs and once traversing through them, it is possible to see the maximum effect of Power Compiler on clock tree power. In one case, the script calculates the clock tree power assuming that the CGCs are in a gating state. In the other case, it calculates the clock tree power assuming that the CGCs are in a non-gating state. The difference between these two values represents the maximum dynamic current savings due to Power Compiler clock gating. The real dynamic current consumed by a clock tree will be somewhere between these two values, depending on how often the clock enables are active, which is completely design dependent.

    Second, the script can track how much current is consumed at the leaf of the tree. In this paper, power consumed at the leaf of the tree is computed by summing the currents due to the final wire and pin caps after the last buffer (or CGC), as well as the final leaf cell. If a majority of the clock tree power comes from the leaf of the tree, Power Compiler will be extremely useful in saving clock tree power. However, if a majority of the clock tree power comes from higher up in the tree, Power Compiler would not be very effective in gating off clock tree power.

    Third, the script can keep track of how much current is consumed by various components of the clock tree:

    - Internal register power, due solely to clock pin toggling
    - Internal clock buffer switching power
    - Wire capacitance
    - Pin capacitance
    - Clock gating cells switching power
    - Internal memory power, due solely to clock pin toggling
    - Power from miscellaneous non-clock buffer cells in tree

    With this information, it should be easy to see which areas of the clock tree should be evaluated to save the most power.

    3.2 Details of the PrimeTime Tcl script

    The complete PrimeTime Tcl script is provided in the Appendix for reference. This section describes how the script works. The script does the following in the order listed:

    1. Prerequisite: Source the Tcl script generated from preprocessing the standard cell .lib file.
    2. The user provides the start point of the clock tree.
    3. Traverse the tree recursively, continuing only if the script finds a legal clock tree library cell.
    4. Optionally stop at Power Compiler clock gating cells.
    5. Record all components and their power contributions along the way, and also the power contributions of components at each level of the tree.
    6. For each leaf traversed, record the power consumed:
       a. Include the last wire and pin caps.
       b. Include the final internal switching power of the leaf cell.
    7. When traversal is complete, report final power statistics.
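The steps above can be sketched outside PrimeTime as a recursive walk over a toy netlist. This is a minimal Python model under stated assumptions, not the Appendix script: the `NETLIST` structure, the cell names, the capacitance values, the CGC internal power of 30, and the `CAP_TO_POWER` scale factor (standing in for whatever converts capacitance into the same units as the .lib power numbers) are all hypothetical. The sketch also assumes that a gated CGC still consumes its own internal power and the wire and pin load on its clock input, and that only the subtree below it is switched off.

```python
from collections import defaultdict

# Hypothetical toy netlist: name -> (kind, internal power per cycle, fanout).
# Each fanout entry is (sink name, wire cap of that branch, sink pin cap).
NETLIST = {
    "root":  ("buffer",   40.0, [("buf_a", 6.0, 2.0), ("cgc_1", 5.0, 2.0)]),
    "buf_a": ("buffer",   40.0, [("reg_0", 3.0, 5.0), ("reg_1", 3.0, 5.0)]),
    "cgc_1": ("cgc",      30.0, [("reg_2", 3.0, 5.0), ("reg_3", 3.0, 5.0)]),
    "reg_0": ("register", 51.0, []),
    "reg_1": ("register", 51.0, []),
    "reg_2": ("register", 51.0, []),
    "reg_3": ("register", 51.0, []),
}
LEAF_KINDS = {"register", "memory", "misc"}
CAP_TO_POWER = 1.0  # assumed scale factor from capacitance to power units


def analyze(start, stop_at_cgc=False):
    """Traverse the tree from `start`; return (total, per-component, leaf power)."""
    totals = defaultdict(float)
    leaf_power = 0.0

    def visit(name):
        nonlocal leaf_power
        kind, internal, fanout = NETLIST[name]
        totals[kind] += internal              # step 5: record this component
        if kind in LEAF_KINDS:
            return                            # leaf reached, stop descending
        if kind == "cgc" and stop_at_cgc:
            return                            # step 4: treat the CGC as gating
        for sink, wire_cap, pin_cap in fanout:
            wire = wire_cap * CAP_TO_POWER
            pin = pin_cap * CAP_TO_POWER
            totals["wire"] += wire
            totals["pin"] += pin
            sink_kind, sink_internal, _ = NETLIST[sink]
            if sink_kind in LEAF_KINDS:
                # step 6: last wire and pin caps plus the leaf's internal power
                leaf_power += wire + pin + sink_internal
            visit(sink)

    visit(start)
    return sum(totals.values()), dict(totals), leaf_power


ungated_total, ungated_parts, ungated_leaf = analyze("root")
gated_total, gated_parts, gated_leaf = analyze("root", stop_at_cgc=True)
max_savings = (ungated_total - gated_total) / ungated_total

print("ungated:", ungated_total, ungated_parts)
print("gated:  ", gated_total, gated_parts)
print("leaf share (ungated): %.0f%%" % (100 * ungated_leaf / ungated_total))
print("maximum savings:      %.0f%%" % (100 * max_savings))
```

Running the traversal twice, once with `stop_at_cgc=True` and once without, gives the two bounds discussed in Section 3.1; the difference between them is the maximum savings attributable to clock gating.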


    4.0 Static analysis results

    4.1 Ungated clock tree power

    Table 4-1 lists the complete clock tree power results for five 130nm chips and two 90nm chips. It assumes that all Power Compiler and manually instantiated gating cells are in a non-gating state.

    Current source       Chip 1   Chip 2   Chip 3   Chip 4   Chip 5   Chip 6   Chip 7
                         130nm    130nm    130nm    130nm    130nm    90nm     90nm
    (% of total)
    Misc                 0%       0%       0%       0%       0%       2%       2%
    Memory               0%       0%       1%       1%       1%       0%       0%
    CGC (Int)            4%       3%       5%       4%       4%       1%       1%
    Pin                  10%      10%      9%       9%       9%       10%      10%
    Clock Buffer (Int)   15%      15%      14%      16%      16%      14%      14%
    Wire                 18%      24%      25%      25%      24%      24%      23%
    Register (Int)       52%      48%      47%      45%      46%      49%      50%
    Last Stage           68%      69%      70%      69%      70%      70%      70%

    Table 4-1 Ungated clock tree power for five 130nm chips and two 90nm chips

    Table 4-1 shows two very significant trends. First, regardless of the chip or the technology, roughly 70% of the clock tree power comes from the last stage, that is, the last net and leaf cell. This indicates that Power Compiler will be very beneficial and should be run on these chips. Second, roughly 45 to 50% of the clock tree power is due to register power and only about 15% is due to clock tree buffer power. Therefore, while it always helps to improve clock tree buffer cell designs, improving the register design offers the larger potential reduction in overall dynamic power consumption.

    4.2 Gated clock tree power

    The results in Table 4-2 show the potential maximum power savings for each component of the clock tree on one 130nm chip due to using Power Compiler.

    Current source   Maximum current savings per component using Power Compiler
    Misc             0%
    Memory           0%
    CGC              11%
    Pin              26%
    CBUF             13%
    Wire             31%
    Register         33%
    Total Savings    28%

    Table 4-2 Maximum current savings using Power Compiler


    As discussed earlier in this paper, the maximum current savings is the difference between the current calculated with all clock gating cells in a non-gating state and the current calculated with them in a gating state. This value represents the maximum potential savings due to Power Compiler. Keep in mind that the actual savings will depend on how often each CGC is gated off and on.

    Also note that Power Compiler was not run on a large portion of the chip. It was run on the blocks with the fastest clocks but not on many others, mostly due to tool issues on (now) older versions of Design Compiler. Only 27% of the registers in the chip were synthesized using Power Compiler, and not all of those registers were successfully gated using CGCs. Clock gating was done on 66% of them, which equates to 18% of the total registers in the chip (0.27 x 0.66 = 0.18). So, even with only 18% of the registers successfully gated off, Power Compiler can save up to 28% of the clock tree power.

    Again, the main reason for the high power savings is that the highest-speed blocks were using this feature. The medium-to-slower speed blocks were not able to use Power Compiler. This is an important point: even if Power Compiler cannot be run on all the blocks in a chip, it is imperative that it be used on the highest-speed blocks to maximize the potential savings.

    Table 4-3 shows various gating statistics for 15 different clock trees that were synthesized using Power Compiler.

    Clock name   Maximum power   Gated           Average CGC   Median CGC
                 savings (%)     registers (%)   fanout        fanout
    clock1       27%             40%             32            28
    clock2       33%             54%             26            18
    clock3       35%             59%             31            32
    clock4       42%             58%             25            32
    clock5       44%             70%             27            29
    clock6       45%             56%             43            34
    clock7       47%             55%             35            30
    clock8       48%             77%             17            14
    clock9       55%             86%             20            14
    clock10      57%             77%             32            30
    clock11      57%             72%             53            32
    clock12      62%             72%             31            32
    clock13      66%             84%             43            16
    clock14      67%             90%             21            26
    clock15      75%             88%             31            32

    Table 4-3 Gating statistics for several clock regimes
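One way to look for trends in Table 4-3 is to compute a correlation coefficient between the maximum power savings and each of the other columns. The Python sketch below is an illustration added for this purpose and was not part of the original analysis; the `pearson` helper and the choice of Pearson correlation as the metric are assumptions. The data is copied from Table 4-3.

```python
from math import sqrt

# Rows of Table 4-3:
# (max power savings %, gated registers %, average CGC fanout, median CGC fanout)
TABLE_4_3 = [
    (27, 40, 32, 28), (33, 54, 26, 18), (35, 59, 31, 32), (42, 58, 25, 32),
    (44, 70, 27, 29), (45, 56, 43, 34), (47, 55, 35, 30), (48, 77, 17, 14),
    (55, 86, 20, 14), (57, 77, 32, 30), (57, 72, 53, 32), (62, 72, 31, 32),
    (66, 84, 43, 16), (67, 90, 21, 26), (75, 88, 31, 32),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


savings = [row[0] for row in TABLE_4_3]
gated = [row[1] for row in TABLE_4_3]
avg_fanout = [row[2] for row in TABLE_4_3]
median_fanout = [row[3] for row in TABLE_4_3]

mean_savings = sum(savings) / len(savings)
r_gated = pearson(savings, gated)
r_avg_fanout = pearson(savings, avg_fanout)
r_median_fanout = pearson(savings, median_fanout)

print("mean maximum savings: %.1f%%" % mean_savings)
print("savings vs gated registers:   r = %.2f" % r_gated)
print("savings vs average fanout:    r = %.2f" % r_avg_fanout)
print("savings vs median fanout:     r = %.2f" % r_median_fanout)
```

On this data the coefficient for gated registers is high while the one for average fanout is close to zero, so the sketch is a quick numerical check of which column tracks the savings.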


    The goal of gathering the statistics shown in Table 4-3 was to find a correlation between maximum power savings and some other gating metric. While the power savings generally correlate with the percentage of gated registers, they do not always follow this correlation. Evaluating the average and median CGC fanout was expected to reveal a strong, definite trend between power savings and gated registers. This was not the case.

    It is useful to note the wide range in power savings from clock regime to clock regime. Some savings were as high as 75%, while others were as low as 27%. In general, the clocks on this chip averaged 45 to 50% maximum power savings on their clock trees.

    5.0 Correlation to silicon

    Static clock tree power analysis is meaningless without correlation to silicon. This section shows how the static analysis described in this paper was correlated to clock tree power measured on silicon.

    5.1 Setup and method

    Clock tree power is measured on silicon by configuring software registers to turn individual clock trees on and off. Current measurements are made before and after the clock tree is turned on. The difference between the two measurements is assumed to be entirely due to clock tree power. During this measurement, software holds all the blocks in a reset state to ensure that most of the random logic is stable when the clocks are turned on.

    The clock trees that were not synthesized using Power Compiler can be easily correlated to static measurements. The clock trees that were synthesized using Power Compiler are more difficult to correlate. These clock trees have clock gating cells, and it is not known which gating cells are in a non-gating state and which are in a gating state while the block is being held in reset. It is completely design dependent. Therefore, static clock power predictions can only bound the power when Power Compiler is used. Clock tree power as measured on silicon should be within the bounds predicted by PrimeTime.

    The current measurements have a margin of error due to the precision of the measuring device in the lab. For smaller clock regimes, the margin of error can be very close to the actual measured current value. Therefore, it is necessary to consider the measurement error when correlating clock tree power to static measurements, as noted in the following sections.

    5.2 Ungated clocks

    Table 5-1 shows the correlation to clocks that were not synthesized using Power Compiler.


    Clock name   % Difference     Silicon measurement
                 PT vs. Silicon   margin of error
    clock1       -1%              0%
    clock2       8%               1%
    clock3       -3%              0%
    clock4       -12%             6%
    clock5       4%               0%
    clock6       3%               1%
    clock7       7%               4%
    clock8       9%               4%
    clock9       10%              11%
    clock10      11%              11%
    clock11      7%               11%
    clock12      18%              4%
    clock13      16%              25%
    clock14      4%               1%

    Table 5-1 Ungated Clock Correlation to Silicon

    Note that most clocks (10 of 14) are within 10% of the value predicted by PrimeTime. For the clocks outside the 10% range, the difference could be due to random logic that is not held in reset and is toggling with the clock.

    5.3 Gated clocks

    Table 5-2 shows the correlation to clocks that were synthesized using Power Compiler.

    Clock     Silicon       PT      PT        Silicon measurement
              measurement   gated   ungated   margin of error
    clock1    38            0       100       1
    clock2    -2            0       100       1
    clock3    -3            0       100       7
    clock4    31            0       100       0
    clock5    27            0       100       3
    clock6    81            0       100       2
    clock7    43            0       100       36
    clock8    80            0       100       7
    clock9    86            0       100       17
    clock10   8             0       100       174
    clock11   90            0       100       9
    clock12   56            0       100       6
    clock13   -9            0       100       10
    clock14   26            0       100       0

    Table 5-2 Gated clock correlation to silicon
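The bounds check behind Table 5-2 can be sketched in a few lines of Python. This is an illustration rather than the procedure actually used: the helper names `normalize` and `within_bounds` are hypothetical, and the sketch assumes that the margin of error is expressed in the same normalized units as the silicon measurement. The data is copied from Table 5-2, where the PrimeTime gated and ungated predictions are normalized to 0 and 100.

```python
# (clock, normalized silicon measurement, margin of error) from Table 5-2
TABLE_5_2 = [
    ("clock1", 38, 1), ("clock2", -2, 1), ("clock3", -3, 7), ("clock4", 31, 0),
    ("clock5", 27, 3), ("clock6", 81, 2), ("clock7", 43, 36), ("clock8", 80, 7),
    ("clock9", 86, 17), ("clock10", 8, 174), ("clock11", 90, 9),
    ("clock12", 56, 6), ("clock13", -9, 10), ("clock14", 26, 0),
]


def normalize(measured, gated, ungated):
    """Map a raw measurement onto a scale where gated = 0 and ungated = 100."""
    return 100.0 * (measured - gated) / (ungated - gated)


def within_bounds(value, margin, low=0.0, high=100.0):
    """True if the measurement lies inside [low, high] widened by the margin."""
    return low - margin <= value <= high + margin


outside = [name for name, value, margin in TABLE_5_2
           if not within_bounds(value, margin)]

print("clocks checked:", len(TABLE_5_2))
print("outside the predicted bounds:", outside)
```

A raw silicon current would first be passed through `normalize` together with the two PrimeTime predictions for that clock; the table already lists normalized values, so the check is applied to them directly.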


    The clock tree current measurements listed in Table 5-2 are normalized to a minimum and maximum value of 0 and 100. Note that almost all clocks fall into the minimum/maximum range when the measurement margin of error is taken into account.

    5.4 Dynamic clock savings

    To evaluate the real effect of using Power Compiler, it is necessary to run a real-life application on silicon and measure the results for a block twice: once on a chip that used Power Compiler and once on a chip that did not. This comparison was done on two blocks using two different chips with the same RTL, technology, tools (but not tool versions), and most library cells. The tool versions were updated to a later version for the chip that used Power Compiler, which might cause a slight difference in the dynamic power results when logic is on. Other than that, the only main difference was that one chip used Power Compiler clock gating and the other chip did not.

    Block            PT maximum predicted   Silicon   Cell count   Comments
                     CT savings             savings   difference
    block1           38%                    40%       -8.5%        Savings higher probably also due to cell count decrease
    block2 / mode1   44%                    39%       +4%
    block2 / mode2   44%                    31%       +4%

    Table 5-3 Dynamic clock savings from Power Compiler on silicon

    For both of these blocks, note the significant savings due to Power Compiler clock gating. The power measurements were taken once after each block was set up, just before the blocks were run. The power measurements were taken again while each block was running, doing real work. The difference between these two numbers is what is being compared between the two chips. This data provides proof that real dynamic clock tree power savings can be accomplished by using Power Compiler.

    6.0 Power plotting

    6.1 Goals and setup

    PrimeTime, starting in version 2004.12, features a new option for reading in the parasitics file: read_parasitics_load_locations. This new option provides the capability to trace through the design and create visual plots more easily than before. Previously, a user had to preprocess the SPEF or DEF files and create attributes for PrimeTime to give it the X,Y locations of each cell.

    One application of this option is to create power plots of the chip that show clock tree power consumption. With this capability, a user can quickly tell where the most power in the chip is being consumed due to a clock tree. Furthermore, since this is done using a Tcl script, the user has the flexibility to control things such as stopping (optionally) at clock gating cells. This can be beneficial when analyzing clock tree power. Because clock tree power is roughly 40 to 50% of the dynamic power on a chip, this analysis can be used as an indication of IR drop issues caused by the clock tree.

    The actual plotting is done using Gnuplot 4.0+. A Tcl script can be written to write Gnuplot commands to generate the plots of interest. A sample Tcl script is provided in the Appendix.

    6.2 Sample results

    A sample power plot picture is shown in Figure 6-1.

    Figure 6-1 Power plot example
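    A plot like Figure 6-1 comes from summing per-cell clock tree current into rectangular bins over the die and then mapping each bin total to a color band. The following is a minimal Python sketch of that idea, not the appendix script itself. The function names, the sample cell data, and the choice of six equal-width bands between the smallest and largest bin totals are assumptions made for illustration; the 2000 x 2000 die and 20 x 20 grid mirror the defaults set in the appendix script.

```python
def bin_clock_current(cells, die_x, die_y, nx, ny):
    """Sum per-cell current into an nx-by-ny grid covering the die.

    `cells` is an iterable of (x, y, current) tuples (hypothetical data).
    """
    bins = [[0.0] * ny for _ in range(nx)]
    for x, y, current in cells:
        # Clamp so a cell sitting exactly on the top or right die edge
        # still lands in the last bin.
        i = min(int(x * nx / die_x), nx - 1)
        j = min(int(y * ny / die_y), ny - 1)
        bins[i][j] += current
    return bins


def color_band(value, lo, hi, n_bands=6):
    """Map a bin total to a band index 0..n_bands-1 (0 = lowest current)."""
    if hi <= lo:
        return 0
    width = (hi - lo) / n_bands
    return min(int((value - lo) / width), n_bands - 1)


# Three hypothetical cells: two share a bin, one sits near the far corner.
cells = [(100.0, 100.0, 0.5), (150.0, 120.0, 0.25), (1900.0, 1950.0, 2.0)]
bins = bin_clock_current(cells, die_x=2000, die_y=2000, nx=20, ny=20)
flat = [v for column in bins for v in column]
lo, hi = min(flat), max(flat)
print(bins[1][1], color_band(bins[1][1], lo, hi))
print(bins[19][19], color_band(bins[19][19], lo, hi))
```

    Normalizing against the minimum and maximum bin totals keeps the color scale relative to the design being plotted, so the hottest region is always drawn in the top band regardless of absolute current.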

    The Tcl script divides the chip into bins, each covering a small area. The clock tree power consumed within each bin is summed and then normalized against the other bins. Bins with higher clock tree power are drawn in orange and red. Bins with lower clock tree power are drawn in green and blue.

    It is also possible to annotate the top-level floorplan onto the plot. This can be done by using PrimeTime or Physical Compiler, depending on the particular design flow. If PrimeTime is used for flat timing analysis (that is, no hardmacros, ETMs, or ILMs), Physical Compiler should be used to extract the hardmacro boundaries.

    7.0 Clock tree plotting

    7.1 Goals and setup

    There are other tools that can visually depict clock trees. However, using PrimeTime provides more flexibility. With PrimeTime, any start point can be specified for a clock tree. Users can also stop the clock tree traversal at any particular level of the tree or at clock gating cells. This flexibility is convenient because many users already use PrimeTime for final chip analysis.

    7.2 Sample results

    A sample clock tree is shown in Figure 7-1.
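    The traversal options described in 7.1 (arbitrary start point, stopping at a given level, stopping at clock gating cells) can be sketched outside PrimeTime as a small recursive walk. This is an illustrative Python sketch only: the `Cell` class, its fields, and the sample tree are hypothetical, and the output format is the plain "two points, then a blank line" segment layout that Gnuplot draws as disconnected straight lines.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical stand-in for a placed clock tree cell."""
    name: str
    x: float
    y: float
    is_gating: bool = False
    fanout: list = field(default_factory=list)


def trace(cell, level=0, max_level=None, stop_at_gating=False):
    """Yield (parent, child) edges reachable from `cell`."""
    if max_level is not None and level >= max_level:
        return
    for child in cell.fanout:
        yield cell, child
        if stop_at_gating and child.is_gating:
            continue  # draw the edge to the gating cell, but go no deeper
        yield from trace(child, level + 1, max_level, stop_at_gating)


def gnuplot_segments(edges):
    """Format edges as Gnuplot data: one straight segment per edge."""
    lines = []
    for parent, child in edges:
        lines += [f"{parent.x} {parent.y}", f"{child.x} {child.y}", ""]
    return "\n".join(lines)


ff1 = Cell("ff1", 300, 400)
ff2 = Cell("ff2", 320, 420)
cgc = Cell("cgc0", 250, 350, is_gating=True, fanout=[ff1, ff2])
buf = Cell("buf0", 200, 300, fanout=[cgc])
root = Cell("root", 0, 0, fanout=[buf])

full = list(trace(root))
gated = list(trace(root, stop_at_gating=True))
print(len(full), len(gated))
print(gnuplot_segments(gated))
```

    Passing any cell as the first argument changes the start point, which is the same flexibility the text attributes to the PrimeTime-based flow.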


    Figure 7-1 Clock tree plotting example

    Note that the real routes of the clocks are not shown. Gnuplot simply draws a straight line between connected buffers in the tree.

    8.0 Conclusion

    Clock trees typically consume 40 to 50% of the dynamic power of a chip. Analysis shows that most of that clock tree power, upwards of 70%, is at the leaf. In fact, 45 to 50% of the clock tree power comes from the internal switching power of the registers. The data shows that Power Compiler should, and does, help reduce clock tree power significantly.

    The data also identifies which areas of the clock tree offer the largest return on optimization effort. First, it is imperative that the placement of the last buffer (or gating cell) be optimized with respect to the leaf cells of the tree. The closer the last buffer is to its leaf cells, the greater the power savings that can be achieved. Second, considerable effort should be placed on improving the internal switching power of the registers. If this improvement comes at the cost of performance, then it might be feasible to have two types of registers: one with less dynamic power consumption but poorer performance, and one with better performance but more dynamic power consumption. With these two register types, synthesis tools should be able to choose the appropriate register to meet the design constraints. If the synthesis tools cannot handle this trade-off, external scripts can be written to perform the necessary register swaps where needed. If scripts are used, it is beneficial for the two register types to have identical footprints so that a cell swap can be performed easily without affecting placement.
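    The external register-swap script suggested above could start from a simple selection rule: swap a register to the lower-power variant only where its timing slack covers the extra delay. The Python sketch below illustrates that rule under stated assumptions. The function name, the slack values, the delay penalty, and the safety margin are all hypothetical; a real flow would take slack from its timing reports and the penalty from library characterization.

```python
def pick_register_swaps(slack_by_reg, delay_penalty, margin=0.0):
    """Return registers that can take the slower, lower-power cell.

    slack_by_reg  : {register name: worst setup slack}, hypothetical units (ns)
    delay_penalty : extra delay of the low-power register variant
    margin        : slack to keep in reserve after the swap
    """
    return sorted(
        name
        for name, slack in slack_by_reg.items()
        if slack >= delay_penalty + margin
    )


# Hypothetical slack data for three registers.
slacks = {"u_core/r0": 0.42, "u_core/r1": 0.05, "u_io/r7": 0.20}
swaps = pick_register_swaps(slacks, delay_penalty=0.10, margin=0.05)
print(swaps)
```

    Because the two register types are assumed to share a footprint, the resulting list can be applied as in-place cell swaps without touching placement.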


    9.0 Acknowledgements

    I would like to thank Iain Finlay of QUALCOMM, Inc. for guiding me along the path of clock tree power analysis. Without both his early analysis of clock tree power for previous chips and his guidance, this analysis would never have been done. I would also like to thank both Elisabeth Moseley and Geoffrey Suzuki of Synopsys for their help in researching and writing this paper.

    10.0 References

    [1] Chun, K. and Ling, A. Placement approach cuts SoC power needs. EE Times, 11/21/03. http://www.eetimes.com/story/OEG20031121S0035
    [2] Synopsys. PrimeTime User Guide, Version 2004.12, 2004.

    11.0 Appendix

    11.1 Perl script to preprocess .lib file

    #!/bin/perl
    use strict;
    use warnings;

    my $lib_fname    = "mylibrary.lib";
    my $library_name = "yourlibraryname";
    my $outfile      = "set_attribute_library.tcl";
    my ($cellname, $fall_power, $rise_power, $total_power);

    open(INFILE, "<", $lib_fname) || die "Cannot open $lib_fname for reading\n";
    open(OUTFILE, ">", $outfile) || die "Cannot open $outfile for writing\n";

    print OUTFILE "define_user_attribute -type float -classes lib_cell total_power\n";

    # Each pass of the outer loop looks for the start of a cell definition.
    CELL: while (<INFILE>) {
        if (/^\s+cell \(([^\)]+)\)/) {
            $cellname = $1;
            # For registers, extract here.
            if ($cellname =~ /DFF/) {
                while (<INFILE>) {
                    # Clock pin of register here.
                    if (/^\s+pin \(clock/) {
                        #
                        # Now Find internal power sections..
                        # Make sure to use the one without a when: clause..
                        # We're lucky because it's always the last one
                        # This will take some tweaking to get right for every .lib file.
                        #
                        while (<INFILE>) {
                            if (/internal_power/) {
                                $_ = <INFILE>;
                                $_ = <INFILE>;    # values line
                                /values \( "[0-9\.]+,[0-9\.]+,[0-9\.]+,([0-9\.]+)/;
                                $fall_power = $1;
                                $_ = <INFILE>;    # } line
                                $_ = <INFILE>;    # rise_power line
                                $_ = <INFILE>;    # values line
                                /values \( "[0-9\.]+,[0-9\.]+,[0-9\.]+,([0-9\.]+)/;
                                $rise_power = $1;
                                $total_power = $rise_power + $fall_power;
                                print OUTFILE "set_user_attribute [get_lib_cells $library_name/$cellname] total_power $total_power\n";
                                next CELL;
                            }
                        }
                    }
                }
            }
            # Get clock buffers, CGC cells, etc. here
            elsif (($cellname =~ /CBUF/) || ($cellname =~ /CGC/)) {
                while (<INFILE>) {
                    # Output pin z for clock buffers, clk pin for gating cells.
                    if (   ((/^\s+pin \(z/)   && ($cellname =~ /CBUF/))
                        || ((/^\s+pin \(clk/) && ($cellname =~ /CGC/)) ) {
                        #
                        # Now Find internal power sections..
                        # Make sure to use the one without a when: clause..
                        # We're lucky because it's always the last one
                        #
                        while (<INFILE>) {
                            if (/internal_power/) {
                                $_ = <INFILE>;
                                $_ = <INFILE>;
                                # pick the 3rd line.. avg of that line
                                $_ = <INFILE>;    # values line 1
                                $_ = <INFILE>;    # values line 2
                                $_ = <INFILE>;    # values line 3
                                /\s+"[0-9\.]+,[0-9\.]+,[0-9\.]+,([0-9\.]+)/;
                                $fall_power = $1;
                                $_ = <INFILE>;    # values line 4
                                $_ = <INFILE>;    # values line 5
                                $_ = <INFILE>;    # values line 6
                                $_ = <INFILE>;    # } line
                                $_ = <INFILE>;    # rise_power line
                                $_ = <INFILE>;    # values line 1
                                $_ = <INFILE>;    # values line 2
                                $_ = <INFILE>;    # values line 3
                                /\s+"[0-9\.]+,[0-9\.]+,[0-9\.]+,([0-9\.]+)/;
                                $rise_power = $1;
                                $total_power = $rise_power + $fall_power;
                                print OUTFILE "set_user_attribute [get_lib_cells $library_name/$cellname] total_power $total_power\n";
                                next CELL;
                            }
                        }
                    }
                }
            }
        }
    }

    close(INFILE);
    close(OUTFILE);


    11.2 Clock Analyzer Tcl script

    ###################################################################
    # Clock Analyzer
    #
    # Modes:
    #  1) Write out Excel .csv file for clock tree power per
    #     clock regime with total current consumption per clock
    #     showing where current comes from. (pin, reg, etc.)
    #  2) Write out Excel .csv file for clock tree power per
    #     clock regime as we did for #1 -- except show results for
    #     each level of the clock tree for each clock.
    #  3) Power plotting. Divide up the chip dimensions into
    #     "bins" and show where the most current is being consumed
    #     graphically. (red = most current, etc.)
    #  4) Clock Tree Plotting
    ###################################################################
    # Have to use PT 2004.12 with the command:
    #   set read_parasitics_load_locations true
    # set before you read in the parasitics..
    ###################################################################
    ###################################################################
    # Set up modes below.
    #  Mode 1 = _rpt_summary
    #  Mode 2 = _rpt_summary_level
    #  Mode 3 = _plot_power
    #  Mode 4 = _plot_tree
    #
    # For any of the above modes, you can have the script stop
    # at power-compiler-inserted CGC cells, assuming they are gating
    # the clock. Set the variable "_stop_at_pc_cgc" to 1 for this.
    ###################################################################
    set _rpt_summary 0
    set _rpt_summary_level 0
    set _plot_power 0
    set _plot_tree 0
    set _stop_at_pc_cgc 0
    ###################################################################
    # For each mode, give the report directory or file names here
    # where to write files.
    ###################################################################
    # For mode 1
    set _summary_dir "summary_results"
    if {$_rpt_summary == 1} {
      if { [ file exists $_summary_dir ] == 0 } {
        file mkdir $_summary_dir
      }
    }
    # For mode 2
    set _summary_level_dir "summary_level_results"
    if {$_rpt_summary_level == 1} {
      if { [ file exists $_summary_level_dir ] == 0 } {
        file mkdir $_summary_level_dir
      }
    }
    # For mode 3
    set _power_dir "power_results"
    if {$_plot_power == 1} {
      if { [ file exists $_power_dir ] == 0 } {
        file mkdir $_power_dir
      }
    }
    # For mode 4
    set _plot_tree_file "test.gnuplot"
    set _plot_tree_cmd_file "test.cmd.gnuplot"
    if {$_plot_tree == 1} {
      set PLOT_TREE_FILE [open $_plot_tree_file w+]
      set PLOT_TREE_CMD_FILE [open $_plot_tree_cmd_file w+]
    }
    ###################################################################


    # Power Setup info.
    # 1) VDD (in Volts)
    ###################################################################
    set _vdd 2.0
    ###################################################################
    # This filter sets up all the valid cells on the clock tree
    # or sitting at the leaf of a tree. This should include
    # any clock tree buffers, inverters that are allowed, delay cells,
    # gating cells, memories, regs, etc.
    ###################################################################
    set _filter_valid_ct_cells "@ref_name =~ *BUF* || ref_name =~ *DFF* || ref_name =~ *CGC* || ref_name =~ *RAM*"
    ###################################################################
    # Need to know valid leaf cell names so know when to stop
    # tracing through cells.
    ###################################################################
    set _filter_valid_leaf_cells "@ref_name =~ *DFF* || ref_name =~ *RAM*"
    ###################################################################
    # Need to know CGC library cell name that Power Compiler will
    # use to insert clock gating cells.
    ###################################################################
    set _filter_cgc_name "*CGC*"
    ###################################################################
    # Need to know valid register library cell names.
    ###################################################################
    set _filter_reg_name "*DFF*"
    ###################################################################
    # Need to know valid clock buffer library cell names.
    ###################################################################
    set _filter_cbuf_name "*CBUF*"
    ###################################################################
    # Need to know valid memory library cell names.
    ###################################################################
    set _filter_mem_name "*RAM*"
    ###################################################################
    # Need to know valid misc library cell names that could show up.
    ###################################################################
    set _filter_misc_name "*INV*"
    ###################################################################
    # Write out header lines to .csv files if required to do so.
    ###################################################################
    if {$_rpt_summary == 1} {
      set CURRENT_SUM [open $_summary_dir/current_sum.csv w+]
      puts $CURRENT_SUM "Clock,Simple Clock Name,Pin,Wire,CGC,CBUF,Misc,Reg,Mem,Total,Last Stage Power,Freq,mA/MHz,Num Regs"
    }
    if {$_rpt_summary_level == 1} {
      set CURRENT_LEVEL [open $_summary_level_dir/current_level.csv w+]
      puts $CURRENT_LEVEL "Clock,Simple Clock Name,Level,Pin,Wire,CGC,CBUF,Misc,Reg,Mem,Total,Running Total,Freq,mA/MHz,Num Cells,Num Leaf Cells"
    }
    #####################################################
    # This is only needed if you want to do gnuplot
    # plotting (power and/or clock).
    #####################################################
    set _die_size_x 2000
    set _die_size_y 2000
    #####################################################
    # This is only needed if you want to do gnuplot
    # clock plotting.
    # Then do:


    # (execute gnuplot 4.0)
    # load "$_plot_tree_cmd_file"
    #####################################################
    # This is the file that was generated from PC to show hardmacro boundaries
    set _hm_ref_graph_file "pc_hm_boxes.graph"
    if {$_plot_tree == 1} {
      puts $PLOT_TREE_FILE ""
      puts $PLOT_TREE_CMD_FILE "set multiplot"
      puts $PLOT_TREE_CMD_FILE "set key off"
      puts $PLOT_TREE_CMD_FILE "set style line 1"
      puts $PLOT_TREE_CMD_FILE "set style line 6"
      # The gnuplot range brackets are escaped so that Tcl does not
      # treat them as command substitution.
      puts $PLOT_TREE_CMD_FILE "plot \[0:$_die_size_x\]\[0:$_die_size_y\] '$_hm_ref_graph_file' with lines ls 6"
      puts $PLOT_TREE_CMD_FILE "plot \[0:$_die_size_x\]\[0:$_die_size_y\] '$_plot_tree_file' with lines ls 1"
    }
    #####################################################
    # This is only needed if you want to do gnuplot
    # power plotting
    #
    # Then do:
    # (execute gnuplot 4.0)
    # load "$_power_dir/gnuplot.script"
    # plot [0:$_die_size_x][0:$_die_size_y] 'pc_hm_boxes.graph' with lines ls 6
    #####################################################
    #
    # Specify how many bins in X and Y direction for power plotting
    # Should be nice even number from chip dimensions.
    #
    set _num_x_bins 20
    set _num_y_bins 20
    set _x_bin_size [expr $_die_size_x / $_num_x_bins]
    set _y_bin_size [expr $_die_size_y / $_num_y_bins]
    #
    # Set up power bins initialized to 0
    #
    if {$_plot_power == 1} {
      for {set i 0} {$i= $_num_y_bins]} {
        set y_bin $_num_y_bins
      }
      set _current_bin($x_bin,$y_bin) [expr $_current_bin($x_bin,$y_bin) + $power]
    }
    #####################################################
    # Initialize variables that should span across
    # all calls to trace_clock_tree procedure. These


    # are used to sum up all #'s so we can get entire
    # chip stats.
    #####################################################
    set _last_stage_power 0
    set _top_total_current_due_to_pins 0
    set _top_total_current_due_to_wires 0
    set _top_total_current_due_to_cgcs 0
    set _top_total_current_due_to_cbufs 0
    set _top_total_current_due_to_misc 0
    set _top_total_current_due_to_regs 0
    set _top_total_current_due_to_mems 0
    set _top_total_current_all 0
    #####################################################
    # Finally, the real procedure call gets defined.
    #####################################################
    proc trace_clock_tree { _myclock _myclockname _mylevel _freq } {
      #########################################################
      # Set up global variables to access here.
      #########################################################
      global _vdd
      global _plot_tree
      global PLOT_TREE_FILE
      global PLOT_TREE_CMD_FILE
      global _last_stage_power
      global _plot_power
      global _stop_at_pc_cgc
      global _filter_valid_ct_cells
      global _filter_valid_leaf_cells
      global _max_level
      global _num_cells_at_level
      global _num_regs
      global _total_current_at_level
      global _total_wire_current_at_level
      global _total_mem_current_at_level
      global _total_pin_current_at_level
      global _total_cgc_current_at_level
      global _total_cbuf_current_at_level
      global _total_misc_current_at_level
      global _total_reg_current_at_level
      global _total_leafs_at_level
      global _top_total_current_due_to_pins
      global _top_total_current_due_to_wires
      global _top_total_current_due_to_cgcs
      global _top_total_current_due_to_cbufs
      global _top_total_current_due_to_misc
      global _top_total_current_due_to_regs
      global _top_total_current_due_to_mems
      global _top_total_current_all
      global CURRENT_SUM
      global CURRENT_LEVEL
      global _summary_level_dir
      global _rpt_summary
      global _rpt_summary_level
      global _last_stage_power_for_clock
      global _filter_cgc_name
      global _filter_reg_name
      global _filter_cbuf_name
      global _filter_mem_name
      global _filter_misc_name
      #########################################################
      # Store the original level we were at coming into
      # this routine. Increment it for use inside this
      # routine.
      #########################################################
      set orig_mylevel $_mylevel
      incr _mylevel
      #########################################################
      # If this is the first time we are being called
      # for a clock, let's initialize a bunch of variables.


      # The arrayed variables need to be initialized also.
      # I picked 10000 just as some high number. I hope no
      # clock trees are 10000 levels deep!
      #########################################################
      if {$orig_mylevel == 0} {
        set _max_level 0
        set _num_regs 0
        set _last_stage_power_for_clock 0
      }
      #########################################################
      # Keep track of max clock tree depth.
      #########################################################
      if {[expr $_mylevel > $_max_level]} {
        set _max_level $_mylevel
      }
      #########################################################
      # Let's initialize/set some arrayed variables
      # if they don't already exist.
      #########################################################
      if {![info exists _num_cells_at_level($_mylevel)]} {
        set _num_cells_at_level($_mylevel) 0
      }
      if {![info exists _total_current_at_level($_mylevel)]} {
        set _total_current_at_level($_mylevel) 0
      }
      if {![info exists _total_wire_current_at_level($_mylevel)]} {
        set _total_wire_current_at_level($_mylevel) 0
      }
      if {![info exists _total_pin_current_at_level($_mylevel)]} {
        set _total_pin_current_at_level($_mylevel) 0
      }
      if {![info exists _total_cgc_current_at_level($_mylevel)]} {
        set _total_cgc_current_at_level($_mylevel) 0
      }
      if {![info exists _total_reg_current_at_level($_mylevel)]} {
        set _total_reg_current_at_level($_mylevel) 0
      }
      if {![info exists _total_mem_current_at_level($_mylevel)]} {
        set _total_mem_current_at_level($_mylevel) 0
      }
      if {![info exists _total_cbuf_current_at_level($_mylevel)]} {
        set _total_cbuf_current_at_level($_mylevel) 0
      }
      if {![info exists _total_misc_current_at_level($_mylevel)]} {
        set _total_misc_current_at_level($_mylevel) 0
      }
      if {![info exists _total_leafs_at_level($_mylevel)]} {
        set _total_leafs_at_level($_mylevel) 0
      }
      #########################################################
      # Ok, let's begin.
      #
      # Get the cell and its X,Y coordinates for plotting,
      # if needed later.
      #########################################################
      set orig_buf_cell [get_cells -of_objects $_myclock]
      set orig_buf_loc_x [get_attribute $orig_buf_cell x_coordinate_max]
      set orig_buf_loc_y [get_attribute $orig_buf_cell y_coordinate_max]
      #########################################################
      # Get immediate fanouts of this cell, filtered for only
      # valid clock tree/leaf cell types. Remember to
      # remove the original cell we started with.
      #########################################################
      set myclock_bufs [remove_from_collection \
        [filter [all_fanout -flat -only_cells -levels 1 -from $_myclock] \
          $_filter_valid_ct_cells] \
        [get_cells -of_objects $_myclock] ]
      #########################################################
      # Get output net from this net and gather capacitance info
      #


      # Let's get the net associated with the output of
      # the cell in question. Then get the pin and wire
      # capacitances for that net. If this cell isn't driving
      # anything (weird case), then don't do anything.
      #########################################################
      set net_name [all_connected $_myclock]
      if {[sizeof_collection $net_name] != 0} {
        set wire_cap [get_attribute -class net $net_name wire_capacitance_max]
        set pin_cap [get_attribute -class net $net_name pin_capacitance_max]
        set tot_cap [get_attribute -class net $net_name total_capacitance_max]
        #########################################################
        # Store the current value due to this net into
        # variables for later use. Use divide ratio as needed
        # to get from cap units to desired current units to report.
        #########################################################
        set _total_current_at_level($_mylevel) [expr \
          $_total_current_at_level($_mylevel) + \
          ($tot_cap * $_vdd * $_freq / 1000000)]
        set _total_pin_current_at_level($_mylevel) [expr \
          $_total_pin_current_at_level($_mylevel) + \
          ($pin_cap * $_vdd * $_freq / 1000000)]
        set _total_wire_current_at_level($_mylevel) [expr \
          $_total_wire_current_at_level($_mylevel) + \
          ($wire_cap * $_vdd * $_freq / 1000000)]
      }
      #########################################################
      # For debugging purposes, it may be useful to know
      # what cells were rejected due to our filters above.
      # Uncomment this below if you want to know that.
      #########################################################
      # list rejected ones..
      #set rejected [remove_from_collection [remove_from_collection [all_fanout -flat -only_cells -levels 1 -from $_myclock] $myclock_bufs] [get_cells -of_objects $_myclock]]
      #foreach_in_collection rejected_inst $rejected {
      #  set rejected_name [get_attribute $rejected_inst full_name]
      #  echo "rejected: $rejected_name"
      #}
      #########################################################
      # Ok, let's go through all the fanouts from the startpoint
      # at this level. We need to recursively traverse each
      # one (if not a leaf). We also need to start gathering
      # all sorts of stats.
      #########################################################
      #########################################################
      # We need this found_leaf variable to help us calculate
      # the power at the last stage of the clock tree. We don't
      # want to count the last buffer and net more than once!
      #########################################################
      set found_leaf 0
      foreach_in_collection mybuf $myclock_bufs {
        incr _num_cells_at_level($_mylevel)
        set cell_name [get_attribute $mybuf full_name]
        set ref_name [get_attribute $mybuf ref_name]
        set lib_cell_name [get_attribute [get_lib_cells -of_objects $mybuf] full_name]
        set new_buf_loc_x [get_attribute [get_cells $cell_name] x_coordinate_max]
        set new_buf_loc_y [get_attribute [get_cells $cell_name] y_coordinate_max]
        if {$_plot_tree == 1} {
          puts $PLOT_TREE_FILE "$orig_buf_loc_x $orig_buf_loc_y"
          puts $PLOT_TREE_FILE "$new_buf_loc_x $new_buf_loc_y"
          puts $PLOT_TREE_FILE ""
        }
        ###############################################################################
        # If we stop at PC-CGCs and we're at a PC-CGC, we need to include the
        # average power of clk pin (not through cell, which is what's in "total_power").
        ###############################################################################


        if {($_stop_at_pc_cgc == 1) && [string match $_filter_cgc_name $ref_name]} {
          set pwr [get_attribute [get_lib_cells $lib_cell_name] gated_power]
        } else {
          set pwr [get_attribute [get_lib_cells $lib_cell_name] total_power]
        }
        # Just in case we didn't set the user attribute for this cell type here
        # let's use 0 so our computations below don't die.
        if {$pwr == ""} {
          echo "Warning: $cell_name ($ref_name) has no total_power attribute set on it"
          set pwr 0
        }
        set _total_current_at_level($_mylevel) \
          [expr $_total_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        if {$_plot_power == 1} {
          store_current_in_bin $new_buf_loc_x $new_buf_loc_y [expr ($pwr * $_freq / $_vdd / 1000000)]
        }
        if [string match $_filter_reg_name $ref_name] {
          incr _num_regs
          set _total_reg_current_at_level($_mylevel) \
            [expr $_total_reg_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        } elseif [string match $_filter_mem_name $ref_name] {
          set _total_mem_current_at_level($_mylevel) \
            [expr $_total_mem_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        } elseif [string match $_filter_cbuf_name $ref_name] {
          set _total_cbuf_current_at_level($_mylevel) \
            [expr $_total_cbuf_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        } elseif [string match $_filter_cgc_name $ref_name] {
          set _total_cgc_current_at_level($_mylevel) \
            [expr $_total_cgc_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        } elseif [string match $_filter_misc_name $ref_name] {
          set _total_misc_current_at_level($_mylevel) \
            [expr $_total_misc_current_at_level($_mylevel) + ($pwr * $_freq / $_vdd / 1000000)]
        }
        ###################################################################################
        # If we are at a PC-CGC and we need to stop here, record it (else part of this
        # if statement) and don't trace beyond it.
        #
        # If we are at a leaf cell, record it (else part of this if statement) and
        # don't trace beyond it.
        #
        # Otherwise, continue down tree.
        ###################################################################################
        if {([sizeof_collection [filter [get_cells $cell_name] $_filter_valid_leaf_cells]] == 0) && !([string match $_filter_cgc_name $ref_name] && ($_stop_at_pc_cgc == 1))} {
          set myoutput_pin [get_pins -of_objects $mybuf -filter "@pin_direction == out"]
          ###################################################################
          # Trace down tree more!
          ###################################################################
          trace_clock_tree $myoutput_pin $_myclockname $_mylevel $_freq
        } else {
          ###################################################################
          # We are at a leaf.
          ###################################################################
          incr _total_leafs_at_level($_mylevel)
          set _last_stage_power [expr $_last_stage_power + ($pwr * $_freq / $_vdd / 1000000)]
          set _last_stage_power_for_clock [expr $_last_stage_power_for_clock + ($pwr * $_freq / $_vdd / 1000000)]
          ###################################################################
          # Store cap due to wire+pin and last leaf cell


          ###################################################################
          if {$found_leaf == 0} {
            set found_leaf 1
            set current [expr $tot_cap * $_vdd * $_freq / 1000000]
            # Add in net between buffer and this reg only once
            set _last_stage_power [expr $_last_stage_power + ($tot_cap * $_vdd * $_freq / 1000000)]
            set _last_stage_power_for_clock [expr $_last_stage_power_for_clock + ($tot_cap * $_vdd * $_freq / 1000000)]
          }
        }
      }
      ###############################################################
      # If finishing up an entire clock (done with recursion)
      # Let's write out stats for this clock
      ###############################################################
      if {$orig_mylevel == 0} {
        set _total_current_due_to_pins 0
        set _total_current_due_to_wires 0
        set _total_current_due_to_cgcs 0
        set _total_current_due_to_cbufs 0
        set _total_current_due_to_misc 0
        set _total_current_due_to_regs 0
        set _total_current_due_to_mems 0
        set _total_leafs 0
        set _total_current_all 0
        for {set i 1} {$i


            puts $CURRENT_LEVEL "$_myclock,$_myclockname,$i,$_total_pin_current_at_level($i),$_total_wire_current_at_level($i),$_total_cgc_current_at_level($i),$_total_cbuf_current_at_level($i),$_total_misc_current_at_level($i),$_total_reg_current_at_level($i),$_total_mem_current_at_level($i),$_total_current_at_level($i),$_total_current_all,$_freq,$ma_mhz,$_num_cells_at_level($i),$_total_leafs_at_level($i)"
          }
        }
        set ma_mhz [expr $_total_current_all / $_freq]
        if {$_rpt_summary == 1} {
          puts $CURRENT_SUM "$_myclock,$_myclockname,$_total_current_due_to_pins,$_total_current_due_to_wires,$_total_current_due_to_cgcs,$_total_current_due_to_cbufs,$_total_current_due_to_misc,$_total_current_due_to_regs,$_total_current_due_to_mems,$_total_current_all,$_last_stage_power_for_clock,$_freq,$ma_mhz,$_num_regs"
        }
        ###############################################################
        # Let's reset these back to 0 so next run doesn't
        # have to set them to 0 again.
        ###############################################################
        for {set i 0} {$i


      puts $SUMMARY_FILE "Total CGC current: $_top_total_current_due_to_cgcs"
      puts $SUMMARY_FILE "Total CBUF current: $_top_total_current_due_to_cbufs"
      puts $SUMMARY_FILE "Total Misc current: $_top_total_current_due_to_misc"
      puts $SUMMARY_FILE "Total reg current: $_top_total_current_due_to_regs"
      puts $SUMMARY_FILE "Total mem current: $_top_total_current_due_to_mems"
      puts $SUMMARY_FILE "Last stage current: $_last_stage_power ($percentage % of total current)"
      puts $SUMMARY_FILE "Total current: $_top_total_current_all"
      close $SUMMARY_FILE
    }
    if {$_rpt_summary_level == 1} {
      set SUMMARY_LEVEL_FILE [open $_summary_level_dir/summary w+]
      puts $SUMMARY_LEVEL_FILE "Total pin current: $_top_total_current_due_to_pins"
      puts $SUMMARY_LEVEL_FILE "Total wire current: $_top_total_current_due_to_wires"
      puts $SUMMARY_LEVEL_FILE "Total CGC current: $_top_total_current_due_to_cgcs"
      puts $SUMMARY_LEVEL_FILE "Total CBUF current: $_top_total_current_due_to_cbufs"
      puts $SUMMARY_LEVEL_FILE "Total Misc current: $_top_total_current_due_to_misc"
      puts $SUMMARY_LEVEL_FILE "Total reg current: $_top_total_current_due_to_regs"
      puts $SUMMARY_LEVEL_FILE "Total mem current: $_top_total_current_due_to_mems"
      puts $SUMMARY_LEVEL_FILE "Last stage current: $_last_stage_power ($percentage % of total current)"
      puts $SUMMARY_LEVEL_FILE "Total current: $_top_total_current_all"
      close $SUMMARY_LEVEL_FILE
    }
    echo "Total pin current: $_top_total_current_due_to_pins"
    echo "Total wire current: $_top_total_current_due_to_wires"
    echo "Total CGC current: $_top_total_current_due_to_cgcs"
    echo "Total CBUF current: $_top_total_current_due_to_cbufs"
    echo "Total Misc current: $_top_total_current_due_to_misc"
    echo "Total reg current: $_top_total_current_due_to_regs"
    echo "Total mem current: $_top_total_current_due_to_mems"
    echo "Last stage current: $_last_stage_power ($percentage % of total current)"
    echo "Total current: $_top_total_current_all"
    ###############################################################
    # For Plotting power, there's lots of stuff to write out..
    ###############################################################
    if {$_plot_power == 1} {
      set GNUPLOT_POWER_SCRIPT [open $_power_dir/gnuplot.script w+]
      puts $GNUPLOT_POWER_SCRIPT "set key off"
      puts $GNUPLOT_POWER_SCRIPT "set style line 1"
      puts $GNUPLOT_POWER_SCRIPT "set style line 2"
      puts $GNUPLOT_POWER_SCRIPT "set style line 3"
      puts $GNUPLOT_POWER_SCRIPT "set style line 4"
      puts $GNUPLOT_POWER_SCRIPT "set style line 5"
      puts $GNUPLOT_POWER_SCRIPT "set style line 6"
      puts $GNUPLOT_POWER_SCRIPT "set style line 7"
      puts $GNUPLOT_POWER_SCRIPT "set style line 8"
      puts $GNUPLOT_POWER_SCRIPT "set multiplot"
      set _max_current_in_bin 0
      set _min_current_in_bin 9999999
      for {set i 0} {$i


          set fname "$_power_dir/box_${i}_${j}"
          set OUTFILE [open $fname w+]
          set bottom_x [expr $i*$_x_bin_size]
          set bottom_y [expr $j*$_y_bin_size]
          set top_x [expr $bottom_x + $_x_bin_size]
          set top_y [expr $bottom_y + $_y_bin_size]
          # Write the four edges of the bin as separate line segments.
          puts $OUTFILE "$bottom_x $bottom_y"
          puts $OUTFILE "$bottom_x $top_y"
          puts $OUTFILE ""
          puts $OUTFILE "$bottom_x $bottom_y"
          puts $OUTFILE "$top_x $bottom_y"
          puts $OUTFILE ""
          puts $OUTFILE "$top_x $bottom_y"
          puts $OUTFILE "$top_x $top_y"
          puts $OUTFILE ""
          puts $OUTFILE "$bottom_x $top_y"
          puts $OUTFILE "$top_x $top_y"
          puts $OUTFILE ""
          close $OUTFILE
          # Pick a line style (color) from the bin's current band.
          if {[expr $_current_bin($i,$j) < ($bin_current_width+$_min_current_in_bin)]} {
            set ls 2
          } elseif {[expr $_current_bin($i,$j) < ($bin_current_width*2+$_min_current_in_bin)]} {
            set ls 3
          } elseif {[expr $_current_bin($i,$j) < ($bin_current_width*3+$_min_current_in_bin)]} {
            set ls 7
          } elseif {[expr $_current_bin($i,$j) < ($bin_current_width*4+$_min_current_in_bin)]} {
            set ls 8
          } elseif {[expr $_current_bin($i,$j) < ($bin_current_width*5+$_min_current_in_bin)]} {
            set ls 4
          } else {
            set ls 1
          }
          # The gnuplot range brackets are escaped so that Tcl does not
          # treat them as command substitution.
          puts $GNUPLOT_POWER_SCRIPT "plot \[0:$_die_size_x\]\[0:$_die_size_y\] '$_power_dir/box_${i}_${j}' with filledcurves xy=$bottom_x,$bottom_y ls $ls"
        }
      }
      close $GNUPLOT_POWER_SCRIPT
      set SUMMARY_COLORS [open $_power_dir/summary_colors w+]
      set t [expr $bin_current_width+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: green = < $t mA"
      set t [expr $bin_current_width*2+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: blue = < $t mA"
      set t [expr $bin_current_width*3+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: light orange = < $t mA"
      set t [expr $bin_current_width*4+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: dark orange = < $t mA"
      set t [expr $bin_current_width*5+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: magenta = < $t mA"
      set t [expr $bin_current_width*6+$_min_current_in_bin]
      puts $SUMMARY_COLORS "colors: red = < $t mA"
      close $SUMMARY_COLORS
    }
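    The Clock Analyzer script converts two kinds of quantities into current: net capacitance, through cap * vdd * freq / 1000000, and per-cell internal power from the library, through pwr * freq / vdd / 1000000. The Python sketch below restates those two expressions so they can be checked by hand. The units are an assumption, since the listing does not state them: with capacitance in fF, internal energy per clock cycle in fJ, and frequency in MHz, the divide-by-1000000 yields mA, which matches the "mA/MHz" column in the report header. The function names and sample values are hypothetical.

```python
def net_current_ma(cap_ff, vdd, freq_mhz):
    """Switching current drawn to charge a net each cycle: I = C * V * f.

    Assumes cap in fF and freq in MHz, so the result is in mA.
    """
    return cap_ff * vdd * freq_mhz / 1_000_000


def cell_current_ma(energy_fj, vdd, freq_mhz):
    """Internal current of a cell: I = E * f / V.

    Assumes the library value is rise + fall energy per cycle in fJ.
    """
    return energy_fj * freq_mhz / vdd / 1_000_000


# Hypothetical example: a 50 fF net and a register with 12 fJ of
# internal energy per cycle, both at 1.2 V and 100 MHz.
net_ma = net_current_ma(cap_ff=50.0, vdd=1.2, freq_mhz=100.0)
reg_ma = cell_current_ma(energy_fj=12.0, vdd=1.2, freq_mhz=100.0)
print(f"net: {net_ma:.6f} mA, register: {reg_ma:.6f} mA")
print(f"mA/MHz: {(net_ma + reg_ma) / 100.0:.8f}")
```

    If a library uses different units, only the final scale factor changes; the structure of the two expressions stays the same.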