

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1

Power Mapping of Integrated Circuits Using AC-Based Thermography

Abdullah Nazma Nowroz, Gary Woods, and Sherief Reda

Abstract: Post-silicon power validation is an important step in integrated circuit design and fabrication flow. It involves runtime power characterization of a fabricated chip under realistic loadings. The most versatile procedure for post-silicon power characterization involves capturing the thermal emissions from the back of the die and inverting the captured images to get power estimates. This process faces two major challenges: the spatial heat diffusion effect, which blurs the underlying power map, and measurement noise in the thermal imaging system. In this paper, we propose to use ac-based thermography, where ac excitation signals are applied to the chip instead of dc excitation signals, to improve post-silicon power mapping. We show that using ac excitation reduces the impact of flicker noise and spatial heat diffusion, which translates to significant improvements in power mapping accuracy. We perform a number of experiments using a test chip that can be programmed to control spatial and temporal power consumption. We use the test chip to analyze the noise in our thermal imaging system, and to quantify the improvements in power mapping attained from the proposed ac-based methodology. We elucidate the impact of the ac excitation frequency on both the signal-to-noise ratio and power mapping accuracy. We also demonstrate the basic applicability of our technique on a dual-core processor.

Index Terms: Infrared imaging, post silicon validation, power mapping.

I. INTRODUCTION

Accurate power modeling is an essential step in the design of both mobile processors and server processors. The battery life of mobile processors is constrained by the power consumption. In server processors, the occurrence of junction hot spots, which reduce reliability, is determined by the temporal and spatial distribution of power [1]–[3]. Large-scale transistor-level power modeling is computationally very challenging, and as a result high-level power modeling is performed at the expense of accuracy. The dependency of power consumption on runtime workloads, temperature, and process variabilities leads to uncertainties in design-time power estimation models [4]–[6]. Pre-silicon power estimation tools need to be complemented with post-silicon power characterization to determine the true power consumption of circuits in silicon. Post-silicon power maps are developed during the debugging and characterization phases of the first silicon, and then applied to improve the design during re-spins and for future designs. Post-silicon power results can also improve the accuracy of power and thermal models.

Manuscript received December 29, 2011; revised April 30, 2012; accepted July 21, 2012. This work was supported in part by a DoD ARO Grant W911NF-09-1-0320, in part by the NSF Grant 1115424 and Grant 0952866, in part by an equipment grant from the Altera Corporation, and in part by a generous gift from the Intel Corporation.

A. N. Nowroz and S. Reda are with the School of Engineering, Brown University, Providence, RI 02912 USA (e-mail: [email protected]; [email protected]).

G. Woods is with the Electrical and Computer Engineering Department, Rice University, Houston, TX 77005 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2012.2211111

The most versatile approach for post-silicon power mapping is to capture the thermal emissions from the back side of the die and to use these captures to estimate the power [3], [7]–[10]. However, this approach faces two major challenges: 1) spatial low-pass filtering of the underlying power map arising from heat diffusion can cause information loss and 2) the measurement noise in the setup of the infrared imaging system limits the obtainable accuracy [9], [11].

In this paper, we present a post-silicon power mapping methodology, where ac excitation signals are used instead of dc excitation signals to improve mapping accuracy. We present analytical derivations and experimental validations, demonstrating that the proposed approach leads to significant improvements in the power mapping compared to dc-based methods. The major contributions of this paper are as follows.

1) We analyze and quantify the role of noise on the quality of thermal imaging. We break down the noise contribution into flicker and white noise, and analyze their dependencies on integration time and excitation frequencies.

2) We analyze the impact of ac excitation frequency on the extent of spatial heat diffusion and the strength of the thermal signal. We combine this analysis with the noise analysis to elucidate the relationship between the signal-to-noise ratio (SNR), the excitation frequency, and the integration time.

3) We quantify the thermal-to-power model parameters of real chips at desired ac frequencies. Our method captures all modes of heat transfer, whereas previous methods relied on modeling parameters that are derived from analytical expressions that might not capture all modes of heat transfer occurring in real contexts.

4) Given the ac thermal emissions and model parameters, we devise a constrained convex optimization inversion procedure to estimate the post-silicon power maps. In our procedure, additional constraints obtained from lumped physical measurements, e.g., total power, are imposed on the solution space to reduce numerical error.

5) We design and implement a test chip where it is possible to precisely control its spatial and temporal dynamic power consumption. We use the test chip to create a large range of reference power maps. Runtime thermal emissions from the reference power maps are captured using an infrared camera and then processed to reveal the estimated post-silicon power maps. Our novel setup enables us to compare the estimated power maps to the reference power maps to evaluate the power mapping accuracy. We quantify the impact of excitation frequency on the accuracy of post-silicon power maps and relate this accuracy to the SNR. In addition to our test chip, we demonstrate the basic applicability of our technique on a dual-core processor.

1063–8210/$31.00 © 2012 IEEE

The organization of this paper is as follows. Section II discusses relevant work on power mapping of integrated circuits, and Section III provides the necessary background information on ac-based thermography. In Section IV, we analyze the impact of excitation frequency and integration time on flicker noise and white noise. In Section V, we analyze the impact of excitation frequency on spatial heat diffusion and thermal signal strength. We combine the analyses of Sections IV and V to analyze the SNR trends in Section VI. We present our numerical inversion optimization method in Section VII. We provide an extensive set of experimental results on a test chip in Section VIII, and demonstrate the basic applicability of our technique on a dual-core processor in Section IX. Finally, Section X summarizes our main results.

II. RELEVANT WORK ON POWER MAPPING

Power modeling is concerned with finding the breakdown of total chip power among its blocks under real loadings. Pre-silicon power modeling is a topic of active research [4]–[6]. Statistical power analysis techniques are computationally fast, but they only arrive at the expected power consumption of each block [4], and they cannot model the intricate input pattern dependencies among different blocks of an integrated circuit. Switch-level power simulators are accurate, but it is computationally infeasible to conduct large-scale switch-level simulations. The general idea of high-level power modeling techniques is to combine capacitance estimates from design tools and switching activity estimates from architectural simulation tools to compute the power consumption estimates of different processor blocks [5], [12]. Pre-silicon high-level power models deviate from the post-silicon power measurements for a number of reasons. First, the required granularity of runtime power modeling demands a large number of input statistics for each circuit block, many of which cannot be accurately known at early design stages [6]. Second, different circuit design styles can lead to different power estimation results. Hence, the power models might not predict the maximum power dissipation of custom blocks [5]. Third, leakage power contributes a large fraction of the power dissipation [2]. Leakage power depends on process variability and operating temperatures, which are hard to model accurately during design time.

Because of the challenges in pre-silicon power modeling, it is necessary to conduct post-silicon power mapping to validate design choices and analyses. Fig. 1 illustrates the main framework of dc post-silicon power mapping [3], [7]–[10]. In this case, a dc excitation source, e.g., a workload of a stable nature or a test pattern, is applied to the chip under characterization, and the infrared emissions are captured from the back of the die using infrared imaging. The emissions are then inverted to get the power maps. Hamann et al. [3] introduced experimental techniques to work with infrared imaging and proposed techniques to find the power estimates that minimize the total squared error between the temperatures computed from the estimates and the actual temperature measurements. Martinez et al. [10] proposed the use of genetic algorithms to refine the obtained spatial power estimates into dynamic and leakage components. Qi et al. [8] analyzed the errors in power maps obtained from transient thermal responses computed in a simulation environment.

Fig. 1. DC post-silicon power mapping framework using infrared emissions.

In our previous work, we presented techniques to integrate the measurements from infrared imaging, embedded thermal sensors, and electrical current meters to accurately characterize the temperatures and power of computing devices, such as a dual-core processor, programmable logic devices, and embedded processors [13]. We have also proposed techniques from regularization theory to reduce the noise artifacts in dc-based power mapping methodologies [9], [14]. We also presented new techniques for thermal and power characterization of field programmable gate arrays (FPGAs) using infrared imaging techniques [15].

III. BACKGROUND ON AC THERMOGRAPHY

There are two major challenges in post-silicon power mapping: 1) spatial heat diffusion [3], [9], [16], [17] and 2) measurement noise in the thermal imaging system [9], [11]. Heat diffusion blurs the underlying power map and reduces the accuracy of post-silicon power maps as it filters out the spatial high-frequency power patterns. AC excitation reduces the amount of spatial heat diffusion as the ac excitation frequency increases [11], [18]. As a result, ac-based thermography improves the resolution of thermal images and the detection of weak emission sources, which makes it a valuable tool in device characterization [19] and failure analysis of integrated circuits [11].

Fig. 2. AC-based thermography framework.

Fig. 3. Steady-state ac emissions at 2 Hz in (a) time domain and (b) frequency domain for one pixel.

The general framework for ac thermography is given in Fig. 2, where an ac excitation source rather than a dc excitation is applied to the integrated circuit. Applying a true sinusoidal ac source to excite a digital circuit is impossible. Instead, a square wave is applied. Because a square wave can be represented by a Fourier series whose dominant component is the fundamental frequency, this technique does not alter the results as long as the acquired infrared emissions are filtered to extract only the fundamental frequency [11]. AC square-wave excitations in digital circuits can be created by a number of techniques, such as: 1) toggling enable signals of circuit blocks while keeping the operating voltage constant; 2) alternating the voltage supply signal between two operational values (e.g., 0.9 V and 1 V); and 3) executing workloads (for the case of processors) that alternate between an activity phase and inactivity phase. For example, Fig. 3(a) shows the thermal emissions of just one pixel over time arising from a test circuit where a circuit block is toggled at 2 Hz, and Fig. 3(b) shows the Fourier spectrum of the waveform, clearly showing the amplitude at the fundamental frequency. The amplitudes of all pixels at the fundamental frequency form the ac-based thermal image, as shown in Fig. 2.
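As a worked illustration of why the fundamental dominates, consider an idealized square-wave power excitation that toggles between 0 and a peak value $P_0$ with a 50% duty cycle. Both $P_0$ and the duty cycle are assumptions made here for illustration; they are not values reported for the experiments. The Fourier series of such a waveform is

```latex
p(t) = \frac{P_0}{2} + \frac{2P_0}{\pi} \sum_{k = 1, 3, 5, \ldots} \frac{1}{k} \sin\left(2\pi k f_0 t\right)
```

Under these assumptions, the component at $f_0$ has amplitude $2P_0/\pi$, the third harmonic has one third of that amplitude, and all even harmonics vanish, so filtering the emissions at $f_0$ retains the strongest ac component.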

In the context of fault isolation, Breitenstein [11] discusses methods to use ac lock-in techniques to deblur thermal images by deconvolution to reveal sharper power images that identify a failure site with excessive leakage power consumption. However, no practical way of measuring the thermal-to-power modeling parameters is presented; instead, the parameters are derived from analytical derivations that do not capture all modes of heat transfer (e.g., convection at surface and radiation) in a real chip. Furthermore, the reported experimental setup and results are not devised to quantify intricate power maps, but rather to isolate faults where only a handful of locations at most are actively consuming power.

In contrast to previous usages of ac-based thermography, we use ac-based thermography for the purpose of post-silicon power mapping, where intricate spatial power maps produced from tens of circuit blocks can be simultaneously estimated from their combined thermal emissions.

IV. IMPACT OF USING AC EXCITATION ON NOISE

The objective of this section is to analyze and quantify the impact of noise on ac-based thermography. Previous works assumed white noise on the measurements [11]. We will show in this section that $1/f$ flicker noise has a significant contribution to the measurement noise.

Let $T_i(t)$ denote the temperature of pixel $i$ at time $t$ as recorded by the thermal imaging system, and let $P$ denote the integration period of the measurements. Then the temperature magnitude, $T_i$, of pixel $i$ is given by

$$
T_i =
\begin{cases}
\dfrac{1}{P} \displaystyle\int_0^P T_i(t)\,dt, & \text{dc case} \\[2ex]
\dfrac{2}{P} \left\| \displaystyle\int_0^P T_i(t)\, e^{-2\pi j f_0 t}\,dt \right\|_2, & \text{ac case}
\end{cases}
\tag{1}
$$

where $f_0$ is the fundamental frequency of the ac excitation source, and $\| \cdot \|_2$ is the L2 norm, which gives the magnitude of a complex number in this case. If there is no noise in the measurements, then we expect $T_i$ to exhibit no stochastic behavior. However, noise in the measurements leads to a stochastic process, where $T_i$ is a random variable. The main sources of noise in the infrared imaging system are: 1) thermal noise; 2) digitization noise; 3) dark noise; and 4) flicker noise [20]. Thermal noise is caused by agitation of charge carriers, and is present in all electronic devices. Digitization noise arises from the use of analog-to-digital converters in the infrared camera. Dark noise is caused by random generation of electron-hole pairs in the quantum detectors, and is usually present in photosensitive devices. Flicker noise is also present in almost all electronic devices, and is related to the trapping and detrapping fluctuations of charge carriers at the transistor interfaces [21]. We can divide the noise sources into two categories: frequency-dependent and frequency-independent. The first three sources of noise do not show dependency on frequency, and thus, they can be modeled as white noise. White noise has a flat power spectral density that does not vary with the frequency. The last source of noise has frequency dependency, and is usually termed $1/f$ noise.
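A discrete-time version of (1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the synthetic pixel trace (a 0.3 K offset plus a 5 mK sinusoid at 2 Hz, sampled at 100 Hz for 16 s) are assumptions chosen only to exercise the formula.

```python
import cmath
import math

def dc_magnitude(samples):
    """Discrete form of the dc case of (1): mean over the integration period."""
    return sum(samples) / len(samples)

def ac_magnitude(samples, f0, frame_rate):
    """Discrete form of the ac case of (1): (2/P) * |sum_n T[n] e^{-2 pi j f0 t_n} dt|."""
    dt = 1.0 / frame_rate
    period = len(samples) * dt  # integration period P in seconds
    acc = sum(T * cmath.exp(-2j * math.pi * f0 * n * dt) * dt
              for n, T in enumerate(samples))
    return 2.0 * abs(acc) / period

# Hypothetical pixel trace (assumed values, not measured data).
frame_rate = 100.0   # frames per second
f0 = 2.0             # excitation frequency in Hz
P = 16.0             # integration period in seconds
offset, amplitude = 0.3, 0.005
trace = [offset + amplitude * math.sin(2 * math.pi * f0 * n / frame_rate)
         for n in range(int(P * frame_rate))]

dc_value = dc_magnitude(trace)
ac_value = ac_magnitude(trace, f0, frame_rate)
print(dc_value, ac_value)
```

Because the integration period in this sketch spans a whole number of excitation cycles, the constant offset does not leak into the ac magnitude, and the sinusoid does not bias the dc mean.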


Fig. 4. Log-log plot of noise amplitude (temperature in mK) as a function of frequency (Hz). Dashed blue line gives the noise amplitude from measurements using an integration time of 16 s. Red line gives the fitting to measurements. Corner frequency is observed between 2 and 3 Hz. Noise amplitude fitting before corner frequency yields an amplitude of $9.5 \times 10^{-4} f^{-0.51}$.

The amplitude of noise is equal to the standard deviation of the values of $T_i$. The noise amplitude is commonly referred to as noise equivalent temperature difference (NETD) [11], [22], and is given by

$$
\mathrm{NETD}_{f_0,P} = \sqrt{\frac{1}{k} \sum_{i=1}^{k} \left(T_i - \bar{T}\right)^2}
\tag{2}
$$

where $\bar{T}$ is the mean value of the pixel over $k$ successive measurements. If all the pixels are experiencing the same constant temperature, then it does not matter whether the NETD is calculated from $k$ successive measurements of one pixel or from evaluating $k$ pixels on one image.

A. Noise Reduction Using Higher Frequencies

One of the main advantages of using ac excitation is that the frequency-dependent flicker noise component reduces significantly as frequency increases. To analyze the noise in our infrared system, we implement a simple test chip where a circuit block is placed at the center of an otherwise idle chip (the full details of the test chip are given in Section VIII). The block is toggled at different frequency rates 0.25, 0.5, ..., 8 Hz. At each frequency, we capture the thermal emissions for 400 s at steady state using a frame rate of 100 Hz. We then use (1) (with an integration time of P = 16 s) to compute the magnitude of the thermal signal at the fundamental frequency. Thus, each pixel yields 400/16 magnitudes at the fundamental frequency. The noise amplitude is then computed as the standard deviation of these temperature magnitudes as given by (2).
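The segmentation described above (a 400 s capture at 100 Hz split into 16 s windows, giving 400/16 = 25 magnitudes per pixel) can be sketched for one pixel as follows. The trace here is synthetic: the 5 mK signal amplitude, the 20 mK per-frame Gaussian noise, and the random seed are assumptions for illustration only and do not model flicker noise.

```python
import cmath
import math
import random

def ac_magnitude(samples, f0, frame_rate):
    """Discrete form of the ac case of (1)."""
    dt = 1.0 / frame_rate
    acc = sum(T * cmath.exp(-2j * math.pi * f0 * n * dt) * dt
              for n, T in enumerate(samples))
    return 2.0 * abs(acc) / (len(samples) * dt)

def segment_magnitudes(trace, f0, frame_rate, integration_time):
    """Split a trace into consecutive windows of length P and apply (1) to each."""
    n = int(round(integration_time * frame_rate))
    return [ac_magnitude(trace[i:i + n], f0, frame_rate)
            for i in range(0, len(trace) - n + 1, n)]

def netd(magnitudes):
    """Standard deviation of the magnitudes, as in (2)."""
    k = len(magnitudes)
    mean = sum(magnitudes) / k
    return math.sqrt(sum((m - mean) ** 2 for m in magnitudes) / k)

# Synthetic single-pixel capture (assumed values, white noise only).
rng = random.Random(0)
frame_rate, f0, capture_s, P = 100.0, 2.0, 400.0, 16.0
trace = [0.005 * math.sin(2 * math.pi * f0 * n / frame_rate) + rng.gauss(0.0, 0.02)
         for n in range(int(capture_s * frame_rate))]

magnitudes = segment_magnitudes(trace, f0, frame_rate, P)
noise_amplitude = netd(magnitudes)
print(len(magnitudes), noise_amplitude)
```

Repeating this computation for each excitation frequency would give one noise amplitude per frequency, which is the quantity plotted against frequency in the text.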

We plot the noise amplitude (dashed blue line) as a function of frequency in Fig. 4 on a log-log scale. The plot shows that the noise decreases as a function of frequency until it reaches a corner frequency, beyond which the noise amplitude does not show improvement with frequency increase. In our system, this corner frequency is somewhere between 2 and 3 Hz. Fitting the measurements gives a noise amplitude equal to

$$
\mathrm{NETD}_{f,P} =
\begin{cases}
9.5 \times 10^{-4} f^{-0.51}, & f < \text{corner frequency} \\
6.2 \times 10^{-4}, & f \geq \text{corner frequency.}
\end{cases}
\tag{3}
$$

Fig. 5. Log-log plot of noise amplitude (mK) as a function of integration time (s) at excitation frequency 2 Hz. Dashed blue line gives actual measurement. Red solid line gives results from fitting.

The fitted trends are given by the solid red line in Fig. 4. The noise spectral density, which is equal to the variance and hence to the square of the NETD, therefore scales as $f^{-1.02}$ below the corner frequency, a near perfect $1/f$ characteristic. This indicates strong dominance of flicker noise at low frequencies. Thus, increasing the excitation frequency reduces the impact of flicker noise. However, increasing the excitation frequency beyond the corner frequency has no noise benefit as noise becomes dominated by white noise, which has flat frequency characteristics.
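As a quick consistency check on (3), the two fitted branches can be intersected to see where the flicker-noise trend meets the white-noise floor. Treating that intersection as the corner frequency is an assumption of this sketch (the text only states that the corner lies between 2 and 3 Hz), and the function name is hypothetical.

```python
# Fitted constants taken from (3).
A = 9.5e-4      # flicker-branch coefficient
B = 0.51        # flicker-branch exponent
FLOOR = 6.2e-4  # white-noise floor

# Assumed definition: frequency at which A * f**(-B) equals the floor.
corner_hz = (A / FLOOR) ** (1.0 / B)

def netd_model(f_hz):
    """Piecewise noise amplitude of (3), using the intersection as the corner."""
    return A * f_hz ** (-B) if f_hz < corner_hz else FLOOR

for f in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(f, netd_model(f))
print("corner frequency (Hz):", corner_hz)
```

With the fitted constants, the intersection falls at roughly 2.3 Hz, which is consistent with the 2 to 3 Hz range observed in Fig. 4.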

B. Noise Reduction Using Larger Integration Times

As discussed earlier, the second component of noise is white noise. White noise has a Gaussian distribution, and thus by increasing the integration time, we reduce the standard deviation of the thermal signal, which is responsible for the amplitude of noise. From the central limit theorem, we know that the standard deviation of the average of a number of samples of a random variable has a $1/\sqrt{n}$ dependency on the number of samples $n$. Thus, by increasing the integration time, we can reduce white noise proportionally to the square root of integration time [11].

To analyze the white noise in our infrared system, we implement a simple test chip where a circuit block is placed at the center of an otherwise idle chip. The block is toggled at 2 Hz, and the thermal emissions are captured for 400 s at steady state using a camera frame rate of 100 Hz. We then use (1) to compute the magnitude of the thermal signal at 2 Hz for various integration times of 4, 6, ..., 16 s. For each integration time, the noise amplitude is computed as the standard deviation of the thermal magnitudes as given by (2).

We plot the noise amplitude (dashed blue line) as a function of integration time in Fig. 5 on a log-log scale. The plot shows the dependency on the reciprocal of the square root of integration time (solid red line with a fit $\propto 1/n^{0.52}$).
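The $1/\sqrt{n}$ behavior can be reproduced with a small simulation. This sketch averages synthetic unit-variance Gaussian samples rather than camera data; the sample counts, trial count, and seed are arbitrary assumptions, and the slope is obtained from an ordinary least-squares fit in log-log space.

```python
import math
import random

def std_of_means(n, trials, rng):
    """Standard deviation of the mean of n unit-variance Gaussian samples."""
    means = [sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n for _ in range(trials)]
    mu = sum(means) / trials
    return math.sqrt(sum((m - mu) ** 2 for m in means) / trials)

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

rng = random.Random(1)
sample_counts = [25, 100, 400]   # stand-ins for increasing integration time
stds = [std_of_means(n, 400, rng) for n in sample_counts]
slope = loglog_slope(sample_counts, stds)
print(stds, slope)
```

A fitted slope close to -0.5 corresponds to the square-root dependency; the measured exponent of 0.52 quoted above is of the same form.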

V. IMPACT OF USING AC EXCITATION ON SPATIAL TEMPERATURE SIGNAL

In this section, we analyze the impact of ac excitation on spatial heat diffusion and the strength of the thermal signal. We first mathematically analyze this phenomenon and then demonstrate it experimentally with a test chip.

We first assume a semi-infinite isotropic and homogeneous silicon substrate with standard wafer thickness of 750 μm. Heat transfer inside the substrate is governed by the heat diffusion equation

$$
D \nabla^2 T(\vec{r}, t) + p(\vec{r}, t) = \frac{\partial}{\partial t} T(\vec{r}, t)
\tag{4}
$$

where $T(\vec{r}, t)$ is the temperature as a function of the radial distance $\vec{r}$ from the center of the substrate at time $t$, $p(\vec{r}, t)$ is the power function, and $D$ is the thermal diffusivity of silicon ($D = 0.88$ cm²/s). The standard thickness of 200- and 300-mm wafers is about 750 μm, and memory and mobile chips are commonly thinned to less than 100 μm. By comparison, at 20 Hz the thermal diffusion length $L_D \equiv \sqrt{D / 2\pi f}$ is more than 800 μm. Because the diffusion length exceeds the substrate thickness at the excitation frequencies considered here, temperature variations through the thickness are small, which justifies treating heat transfer as a 2-D problem.
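The diffusion-length comparison can be checked with a few lines of arithmetic. The sketch below uses only the diffusivity and the definition of $L_D$ given in the text; the function name and the list of frequencies are illustrative assumptions.

```python
import math

D_CM2_PER_S = 0.88  # thermal diffusivity of silicon, from the text

def diffusion_length_um(f_hz, d_cm2_per_s=D_CM2_PER_S):
    """Thermal diffusion length L_D = sqrt(D / (2 pi f)), converted from cm to micrometers."""
    return math.sqrt(d_cm2_per_s / (2.0 * math.pi * f_hz)) * 1e4

for f in (1.0, 2.0, 4.0, 8.0, 20.0):
    print(f, "Hz ->", round(diffusion_length_um(f)), "um")

length_20hz = diffusion_length_um(20.0)
```

At 20 Hz this gives roughly 837 μm, in line with the "more than 800 μm" stated above, and the length grows as the frequency drops, so the 2-D approximation becomes easier to justify at the lower excitation frequencies used in the experiments.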

We first consider a sinusoidal point heat source at $\vec{r} = 0$ of unit amplitude that is toggled in time with an angular frequency $\omega = 2\pi f$; i.e., $p(\vec{0}, t) = e^{j\omega t}$. From a practical perspective, a point heat source has a physical extent smaller than the spatial resolution of the thermal imaging equipment, and thus occupies at most one pixel. We can also express the temperature function in polar coordinates: $T(\vec{r}, t) = T(r, t)$, where $r$ is the radial distance from the source and without any dependency on the angle, due to the rotational symmetry of the single-point excitation. The thermal response $T(r, t)$ to the unit point heat source is the Green's function $g(r, t)$. At steady-state ac, the boundary conditions of the setup are

$$
g(r \to \infty, t) = 0 \quad \text{and} \quad \lim_{r \to 0} r \frac{\partial}{\partial r} g = e^{j\omega t}.
\tag{5}
$$

The standard Green's function solution for (4) subject to the boundary constraints of (5) is given by

$$
g(r, t) = K_0\!\left(r \sqrt{\frac{j\omega}{D}}\right) e^{j\omega t}
\tag{6}
$$

where $K_0(\cdot)$ is the modified Bessel function of the second kind. $K_0(\cdot)$ is a complex function and its amplitude and phase cannot be described analytically. Thus, we plot in Fig. 6 the amplitude and phase of $K_0(r\sqrt{j\omega/D})$ as a function of $r\sqrt{\omega/D}$. For a given $r$, the figure shows that the amplitude of the temperature decays at a faster rate as $\omega$ increases. That is, the temporal excitation frequency controls the extent of spatial heat diffusion. In ac thermography, the thermal image at the fundamental frequency is the one that is used as it has the strongest signal value; thus, in the frequency domain, only the amplitude of $g(r, t)$ at the fundamental frequency, $f_0$, is used. That is

$$
g_{\omega_0}(r) = K_0\!\left(r \sqrt{\frac{j\omega_0}{D}}\right)
\tag{7}
$$

where $\omega_0$ is equal to $2\pi f_0$. Fig. 7(a) illustrates the amplitudes of the fundamental frequency component of $g_{\omega_0}(r)$ for a number of excitation frequencies: dc (0 Hz), 1, 2, 4, and 8 Hz. The images in Fig. 7(a) show a clear trend where the spatial extent of heat diffusion reduces as the excitation frequency increases. To quantify this reduction, we plot in Fig. 7(b) and (c) the temperature as a function of the distance (in mm) from the center of the point source and the excitation frequency. Fig. 7(c) gives normalized results of Fig. 7(b). We observe the following.

Fig. 6. Amplitude and phase of the Green's function $g_\omega(r)$ as a function of $r\sqrt{\omega/D}$.

1) The thermal “inertia” of silicon reduces the amplitude of the temperature change at higher ac frequencies. For example, the plot of Fig. 7(c) shows that the signal drops to 40% of its peak value at a distance of 3.5 mm for dc, 0.6 mm for 1 Hz, and 0.35 mm for 8 Hz. Thus, using higher excitation frequencies has the advantage of reducing the extent of thermal diffusion, and as a result the contrast of the captured thermal images is improved.

2) Increasing the frequency has the disadvantage of reducing the absolute signal value as shown in Fig. 7(b). This reduction in average signal value ultimately reduces the benefit of ac-based thermography at higher frequencies.
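As a rough numerical illustration of (7), the following Python sketch evaluates $|K_0(r\sqrt{j\omega_0/D})|$ from the integral representation $K_0(z) = \int_0^\infty e^{-z\cosh t}\,dt$ (valid for Re z > 0), using only the standard library. The diffusivity value of 80 mm²/s is an assumed, illustrative figure for silicon, the function names are hypothetical, and the sketch is not the analysis code behind Fig. 7. The dc case is excluded because the argument of $K_0$ vanishes at 0 Hz.

```python
import cmath
import math

D_SILICON = 80.0  # mm^2/s; assumed illustrative thermal diffusivity of silicon


def k0(z, t_max=20.0, steps=8000):
    """Approximate K0(z) for Re(z) > 0 with the trapezoidal rule applied to
    K0(z) = integral from 0 to infinity of exp(-z * cosh(t)) dt."""
    h = t_max / steps
    total = 0.5 * cmath.exp(-z)        # half weight for the t = 0 endpoint
    for i in range(1, steps):
        c = math.cosh(i * h)
        if z.real * c > 700.0:         # remaining terms are negligibly small
            break
        total += cmath.exp(-z * c)
    return total * h


def green_amplitude(r_mm, f_hz, diffusivity=D_SILICON):
    """|g_w0(r)| = |K0(r * sqrt(j * w0 / D))| for r in mm and f in Hz (f > 0)."""
    omega0 = 2.0 * math.pi * f_hz
    return abs(k0(r_mm * cmath.sqrt(1j * omega0 / diffusivity)))


def decay_ratio(f_hz, r_near=0.1, r_far=1.0):
    """Amplitude at r_far relative to the amplitude at r_near."""
    return green_amplitude(r_far, f_hz) / green_amplitude(r_near, f_hz)


for f in (1.0, 2.0, 4.0, 8.0):
    print(f"{f:3.0f} Hz: |g(1 mm)| / |g(0.1 mm)| = {decay_ratio(f):.3f}")
```

Under these assumptions the printed ratio shrinks as the frequency grows, which is the reduced spatial extent of observation 1), while the amplitude at a fixed distance also falls, which is the signal loss of observation 2).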

To experimentally demonstrate the impact of excitation frequency on spatial heat diffusion, we implement a simple test circuit where a circuit block (approximating a point source) is placed at the center of an otherwise idle chip. The block is toggled through an enable signal at various excitation rates, and the ac thermography procedure is used to compute the amplitude of every pixel at the fundamental frequency according to (1). The thermal amplitudes at the fundamental frequency of all pixels are plotted in Fig. 8(a), and the normalized thermal signals as a function of distance from the block are given in Fig. 8(b). The results confirm that using ac excitation reduces the extent of heat diffusion.

To understand the impact of using ac excitation on the final quality of the thermal image, we next combine the analysis of this section, which focused on estimating the signal value, with the noise analysis in Section IV to analyze the SNR as a function of the excitation frequency.

VI. SIGNAL-TO-NOISE RATIO (SNR) ANALYSIS

In Section IV, we observed that increasing the excitation frequency decreases the flicker noise up to the corner frequency. Beyond the corner frequency, noise is dominated by white noise rather than flicker noise. Thus, increasing the excitation frequency beyond the corner frequency does not reduce the noise; instead, increasing the integration time is more beneficial. In Section V, we observed that increasing the ac excitation frequency reduces the extent of heat diffusion, which reduces the blurring of the underlying power map; however, increasing the excitation frequency has the disadvantage of reducing the absolute temperature signal value at the fundamental frequency. In this section, we investigate the impact of the excitation frequency on both the signal and the noise. The SNR is a combined metric that gives the quality of the thermal image.

Fig. 7. Impact of increasing excitation frequency on spatial heat diffusion as computed from the analysis. (a) Theoretical temperature distribution in a 2-D slab of silicon with a point heat source (panels: dc, 1 Hz, 2 Hz, 4 Hz, and 8 Hz). (b) Theoretical temperature versus radial distance from a point heat source. (c) Normalized theoretical temperature versus radial distance from a point heat source.

A. Theoretical Analysis of SNR

To analyze the SNR, we first compute the average thermal signal over a radius of 3.5 mm using (7) at various frequencies from dc to 8 Hz, and then divide the results at each frequency by the corresponding noise amplitude as given by (3). The SNR values as a function of frequency are plotted in Fig. 9. Clearly, the SNR initially improves as the frequency increases because the reduction in flicker noise and in spatial heat diffusion outweighs the reduction in signal value. But as the frequency is increased beyond the corner frequency, the SNR worsens due to the reduction of the signal strength and the lack of reduction in noise.

Fig. 8. Impact of increasing excitation frequency on spatial heat diffusion as measured from the test chip. (a) Measured temperature distribution in a test chip with a point heat source (panels: dc, 1 Hz, 2 Hz, 4 Hz, and 8 Hz). (b) Normalized measured temperature versus radial distance from a point heat source.

Fig. 9. Results from theoretical signal-to-noise analysis.

B. Empirical Analysis of SNR

To measure the SNR, we use the thermal maps measured in the context of the experiment corresponding to Fig. 8, where we calculate the average temperature over a radius of 3.5 mm from the center of our chip area. The 3.5-mm radius is chosen to cover our experimental test chip area. We divide the results at each frequency by the measured noise amplitude as given in Fig. 4. The SNR values as a function of frequency are plotted in Fig. 10. The empirical results show the same trends as the theoretical SNR plot of Fig. 9, and the two plots are in reasonable agreement. For instance, the empirical results of Fig. 10 show the SNR degrading from its peak value of 140 to 80 at 8 Hz, that is, to about 57% of the peak (a reduction of roughly 43%). The same relative reduction is predicted from the theoretical results given in Fig. 9.

Our analysis and results show that for our test chip and imaging equipment, the peak SNR occurs around 2–3 Hz. Our analysis procedure is applicable to other chips and other imaging systems, though the exact peak SNR frequency might differ. This difference could arise because of different flicker noise characteristics in the imaging equipment readout electronics and from the dimensions (in particular the thickness) of the test chip. Reducing the thickness of the chip can shift the peak SNR to higher frequencies.


Fig. 10. Empirical SNR at different frequencies with fixed integration time (16 s).

VII. THERMAL-TO-POWER INVERSION METHOD

In a discretized form, the steady-state dc relationship between power and temperature can be succinctly described using

\[ RP + \varepsilon = T \tag{8} \]

where R is the power-to-thermal operator matrix, P is the power vector that gives the power consumption of every circuit block, ε is the measurement noise vector, and T is the temperature vector that gives the recorded temperatures at every pixel of the thermal image [3]. The length of P is determined by the number of circuit blocks, and the length of T is determined by the camera’s spatial resolution and the dimensions of the die. In steady-state ac, (8) is still valid, except that the elements of P and T are complex numbers that give the amplitudes and phases of power and temperature at the fundamental frequency. If the phase of the power signal is taken as the reference, then P is real and T is complex. The operator matrix R is now frequency-dependent and its elements are complex numbers as described in Section V. We denote the frequency-dependent model matrix by $R_f$, and the dc model matrix by $R_0$.

For power mapping purposes, three steps need to be conducted: 1) T needs to be measured for the chip under characterization after applying the appropriate workload and waiting for ac or dc steady state; 2) R needs to be estimated for the actual chip in its deployed environment; and 3) given T, R, and the total power consumption, a numerical inversion procedure must be carried out to find P. We explain each of these steps in the remainder of this section.

A. Measuring T

The thermal imaging system captures a discretized thermal emission field with an image sampling rate that is at least twice the ac excitation frequency, according to the Nyquist sampling criterion. To measure T, we capture a number of consecutive frames over an integration time, P, as illustrated earlier in Fig. 3. The temperature at each pixel is computed as

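\[ T_i = \begin{cases} \dfrac{1}{P} \displaystyle\int_0^P T_i(t)\,dt, & \text{dc case} \\[2ex] \dfrac{2}{P} \displaystyle\int_0^P T_i(t)\,e^{-2\pi j f_0 t}\,dt, & \text{ac case.} \end{cases} \tag{9} \]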

In the ac case, the temperature $T_i$ of pixel i is a complex number that gives the amplitude and phase of the thermal wave at pixel i. The temperatures from all pixels form the elements of the vector T.
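A discrete version of (9) for a single pixel can be sketched as follows, with the integrals replaced by sums over the captured frames. The synthetic pixel trace (a 30-unit offset plus a 0.5-unit wave at 2 Hz with a phase lag of 0.3 rad) is made up for illustration; the 100-Hz frame rate and 16-s integration time are the values quoted elsewhere in the paper. The function names are hypothetical.

```python
import cmath
import math


def dc_average(samples):
    """dc case of (9): time average of the pixel trace."""
    return sum(samples) / len(samples)


def lockin(samples, frame_rate, f0):
    """ac case of (9): (2/P) * integral of T(t) * exp(-2*pi*j*f0*t) dt,
    approximated by a sum over frames taken at t = k / frame_rate."""
    acc = 0j
    for k, value in enumerate(samples):
        acc += value * cmath.exp(-2j * math.pi * f0 * k / frame_rate)
    return 2.0 * acc / len(samples)


frame_rate = 100.0        # Hz
integration_time = 16.0   # s
f0 = 2.0                  # Hz excitation frequency
n_frames = int(frame_rate * integration_time)

# Synthetic pixel trace (illustrative values only).
samples = [30.0 + 0.5 * math.cos(2.0 * math.pi * f0 * k / frame_rate - 0.3)
           for k in range(n_frames)]

t_dc = dc_average(samples)
t_ac = lockin(samples, frame_rate, f0)
print(f"dc value: {t_dc:.3f}")
print(f"ac amplitude: {abs(t_ac):.3f}, phase: {cmath.phase(t_ac):.3f} rad")
```

Because the integration time covers a whole number of excitation periods, the constant offset cancels in the ac sum and the complex result carries only the amplitude and phase of the wave at f0.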

B. Estimating the R Operator Matrix

For practical power mapping, it is necessary to estimate the operator matrix R accurately. The model parameters must capture all modes of heat transfer (e.g., conduction throughout the solid, convection and radiation at the surface). Thus, analytical derivations as described in Section V are not sufficient. We estimate the model parameters directly from the test chip using the following procedure. For each excitation frequency f, the matrix $R_f$ can be estimated on a column-by-column basis as follows. Exciting only the kth circuit block is mathematically equivalent to setting the vector P equal to $[0\ 0\ \cdots\ P_k\ \cdots\ 0\ 0]'$, where $P_k$ denotes the total additional power (at the fundamental frequency) incurred from exciting block k. The total power of a test chip can be readily measured using an external digital ammeter. Dividing the captured temperature vector T by $P_k$ gives the values of the kth column of the matrix $R_f$. Thus, we can measure $R_f$ column by column by enabling each block one by one and repeating the described procedure. It is also possible to carry out the same procedure in simulation. In this case, it is first necessary to construct a finite-element model (FEM) of the test chip, its substrate board, and its environment, and then to use the FEM within a numerical simulation environment to estimate the model parameters in the same conceptual way as the described procedure. The matrix $R_f$ is a function of f, and thus it needs to be measured at each desired excitation frequency.
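The column-by-column procedure can be sketched as below. The callback `measure_block` stands in for the real capture step (enable block k only, record the complex temperature vector T); here it is simulated from a small made-up matrix so that the sketch is runnable. All names and numbers other than the 25-mW block power are hypothetical.

```python
def estimate_r(measure_block, block_powers):
    """Estimate R_f column by column.

    measure_block(k) returns the temperature vector T (one complex value per
    pixel) captured while only block k is excited; block_powers[k] is P_k.
    Returns R as a list of rows, so r[i][k] maps block k to pixel i.
    """
    columns = []
    for k, p_k in enumerate(block_powers):
        t = measure_block(k)
        columns.append([t_i / p_k for t_i in t])   # kth column = T / P_k
    n_pixels = len(columns[0])
    return [[col[i] for col in columns] for i in range(n_pixels)]


# Made-up "true" operator for 3 pixels and 2 blocks, used only to simulate
# the measurement step.
TRUE_R = [
    [2.0 + 0.5j, 0.4 + 0.1j],
    [0.9 + 0.2j, 0.9 + 0.2j],
    [0.4 + 0.1j, 2.0 + 0.5j],
]
block_powers = [0.025, 0.025]   # W; 25 mW per enabled block


def simulated_measurement(k):
    """Stand-in for a thermal capture with only block k enabled."""
    return [row[k] * block_powers[k] for row in TRUE_R]


r_est = estimate_r(simulated_measurement, block_powers)
for row in r_est:
    print(row)
```

In a real setup the same loop would be repeated for every excitation frequency of interest, since $R_f$ depends on f.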

C. Inverting Temperature to Power

Given the amplitude measurements T, R, and total power, the objective is to find the best power map vector P that minimizes the total squared error between the temperatures as computed from the estimated power P and the measured temperatures T, that is

\[ \arg\min_{P} \| R_f P - T \|_2^2 \tag{10} \]

where $\|\cdot\|_2$ denotes the L2 norm, under the constraints that the sum of the elements of the power vector is equal to the total power consumption of the chip, $p_{\text{total}}$, and that the individual power estimates of circuit blocks must be nonnegative

\[ \| P \|_1 = \sum_k P_k = p_{\text{total}} \tag{11} \]

\[ \forall k : P_k \ge 0 \tag{12} \]

where $\|\cdot\|_1$ denotes the L1 norm. The total power is readily measured through an external digital multimeter. Any multimeter has its own tolerance (δ), so for practical purposes we replace the constraint in (11) with two inequalities:

\[ \sum_k P_k \ge p_{\text{total}} - \delta \tag{13} \]

\[ \sum_k P_k \le p_{\text{total}} + \delta. \tag{14} \]


Fig. 11. Grid of microprogrammable heaters. (a) Grid of bi-level micro heaters (10 × 10 grid; two levels: 0 and 25 mW). (b) Grid of multilevel micro heaters (6 × 6 grid; five levels: 0, 25, 60, 102, and 142 mW).

In our implementation, we use MATLAB’s quadratic optimization solver (lsqlin) to minimize (10) under the constraints of (12)–(14). The solver uses the active-set strategy (also known as a projection method), which relies on a two-step solution. The first step calculates a feasible solution point, and the second step generates an iterative sequence of feasible solution points that converge to the final solution.
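For readers without MATLAB, the optimization problem (10) with constraints (12)–(14) can be illustrated with a small standard-library Python sketch. Note that this swaps in a projected-gradient iteration, not the active-set method of lsqlin, and it handles real-valued amplitudes only. The 4 × 3 matrix, the reference map, and the tolerance are made-up toy values, and every function name is hypothetical.

```python
def project_simplex(v, total):
    """Euclidean projection of v onto {x >= 0, sum(x) = total}."""
    running, theta = 0.0, 0.0
    for i, u_i in enumerate(sorted(v, reverse=True), start=1):
        running += u_i
        candidate = (running - total) / i
        if u_i - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def project_feasible(v, lo, hi):
    """Projection onto {x >= 0, lo <= sum(x) <= hi}, i.e., (12)-(14)."""
    clipped = [max(x, 0.0) for x in v]
    s = sum(clipped)
    if s > hi:
        return project_simplex(v, hi)
    if s < lo:
        return project_simplex(v, lo)
    return clipped


def solve_power_map(r, t, p_total, delta, iters=3000):
    """Minimize ||R P - T||^2 subject to (12)-(14) by projected gradient."""
    n_pix, n_blk = len(r), len(r[0])
    # 1 / ||R||_F^2 is a safe step for the gradient of 0.5 * ||R P - T||^2.
    step = 1.0 / sum(x * x for row in r for x in row)
    p = [p_total / n_blk] * n_blk
    for _ in range(iters):
        resid = [sum(r[i][k] * p[k] for k in range(n_blk)) - t[i]
                 for i in range(n_pix)]
        grad = [sum(r[i][k] * resid[i] for i in range(n_pix))
                for k in range(n_blk)]
        p = project_feasible([p[k] - step * grad[k] for k in range(n_blk)],
                             p_total - delta, p_total + delta)
    return p


# Toy problem: 4 pixels, 3 blocks, with thermal blurring between neighbors.
R = [[1.0, 0.3, 0.1],
     [0.3, 1.0, 0.3],
     [0.1, 0.3, 1.0],
     [0.2, 0.2, 0.2]]
true_p = [25.0, 0.0, 25.0]   # mW reference map
T = [sum(R[i][k] * true_p[k] for k in range(3)) for i in range(4)]

estimate = solve_power_map(R, T, p_total=50.0, delta=0.5)
print([round(x, 3) for x in estimate])
```

On this noise-free toy problem the iteration recovers the reference map; with measured data the residual does not vanish and the constraints matter more. A production implementation would normally rely on a dedicated constrained least-squares solver.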

VIII. EXPERIMENTAL RESULTS

The objective of our experiments is to assess the improvement in power mapping accuracy using the proposed ac-based framework. For validation purposes, we design a test chip where we can control exactly the switching activity, which yields reference power maps. By knowing the reference power maps, we can scientifically validate our thermal-to-power inversion technique as it is applied to estimate the spatial power maps. The basic unit of our test circuit is a programmable micro heater, which consists of a number of ring oscillators (ROs) that are controlled by flip-flops that determine the operational status of the micro heater. We create two kinds of micro heater designs.

1) Bi-Level Micro Heaters: A bi-level micro heater consists of nine 15-stage ROs together with one flip-flop that controls their operational status. If the D flip-flop (DFF) holds a binary value of 1, then the heater is turned on; otherwise, it is turned off. When enabled, each micro heater consumes 25 mW. This value is obtained directly by measuring the incremental power of a single micro heater block when enabled using an external digital multimeter. Using the programmable heater, a grid that consists of 10 × 10 micro heaters is created as shown in Fig. 11(a). In the grid structure, the output of each DFF is connected to the input of the DFF of the consecutive heater, forming a scan chain.

2) Multilevel Micro Heaters: A multilevel micro heater block consists of ROs that can be programmed to one of the following configurations: 9 ROs of 51 stages with a power consumption of 25 mW, 18 ROs of 25 stages with a power consumption of 60 mW, 27 ROs of 19 stages with a power consumption of 102 mW, and 36 ROs of 13 stages with a power consumption of 142 mW. Thus, each micro heater block offers five different power levels: 0, 25, 60, 102, and 142 mW. The DFFs associated with each micro heater determine its status. Using the programmable micro heater blocks, a grid that consists of 6 × 6 blocks is created as shown in Fig. 11(b). As described above, the output of each DFF is connected to the input of the DFF of the consecutive heater, forming a scan chain.

Fig. 12. Implementation areas in Altera Stratix II EP2S180 device. (a) 10 × 10 bi-level heater grid. (b) 6 × 6 multi-level heater grid. (Dimension labels shown in the figure: 9.8 mm, 8.8 mm, 7.9 mm, and 7.2 mm.)

In both designs, the advancement of the programming bits in the chain is controlled by the clock signal. To create any desired power map, we inject control bits into the flip-flops of the micro heaters to selectively turn on the micro heaters that correspond to the required power map. Our experimental novelty of using a chip of programmable micro heaters enables us to achieve the following two experimental goals, which have not been attained in previous thermal-imaging work: 1) the grid structure of the micro heaters, where every micro heater can be selectively controlled, enables us to create any desired spatial power map on a real chip; and 2) the regular and homogeneous structure of the micro heater grid enables us to estimate the power consumption of each micro heater by simply measuring the total power consumption of the grid and dividing it by the number of enabled micro heaters. Our test chip enables us to know the reference power maps by construction, which gives us the means to scientifically evaluate our post-silicon mapping techniques.
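The scan-chain programming of a bi-level power map can be illustrated with a short simulation. The sketch assumes, purely for illustration, that the chain visits the heaters in row-major order and that each clock shifts one new bit into the first DFF; the actual chain ordering on the test chip is not specified here. A 3 × 3 map is used instead of the 10 × 10 grid to keep the output short, and all names are hypothetical.

```python
def bits_for_map(power_map):
    """Bit sequence to shift in so that DFF k ends up holding the enable bit
    of heater k (row-major order assumed). The first bit shifted in travels
    furthest, so the flattened map is sent in reverse."""
    flat = [cell for row in power_map for cell in row]
    return flat[::-1]


def shift_in(chain_length, bits):
    """Simulate the DFF chain: on each clock the new bit enters DFF 0 and
    every stored bit advances to the next DFF."""
    chain = [0] * chain_length
    for bit in bits:
        chain = [bit] + chain[:-1]
    return chain


# 1 = heater enabled (25 mW), 0 = heater off.
desired_map = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
flat_map = [cell for row in desired_map for cell in row]
chain_state = shift_in(len(flat_map), bits_for_map(desired_map))
print(chain_state)

# Per-heater power estimate: total grid power divided by the enabled count.
total_power_mw = 75.0   # hypothetical multimeter reading for this map
print(total_power_mw / sum(chain_state), "mW per enabled heater")
```

After the last clock, the chain state matches the flattened map, so each heater sees its intended enable bit.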

For implementation, we choose a 90-nm Altera Stratix II (EP2S180) FPGA with 180 000 logic elements and a total die area of 23 × 24 mm. The regular fabric of the FPGA ideally fits our design. For our experiments, we use a relatively homogeneous section of the die that spans 7.2 × 7.9 mm for the bi-level 10 × 10 grid as shown in Fig. 12(a), and a section that spans 8.8 × 8.9 mm for the multilevel 6 × 6 grid. The micro heater blocks are mapped to the logic array blocks at the precise grid locations using Altera’s Quartus II placement assignment editor. In order to capture the chip’s thermal emissions, it was necessary to remove the heat spreader. While removing the heat spreader changes the spatial thermal behavior, the change in spatial thermal emissions does not change the underlying dynamic power consumption ($fCV^2$), which is weakly dependent on temperature. The spatial power consumption remains relatively intact, and the new interactions between temperature and power are captured in the learned R operator matrix.

We use a FLIR SC5600 infrared camera with a mid-wave infrared spectral range of 2.5–5.1 μm to capture the thermal emissions from the back side of the die. The specifications of the camera give an NETD of 15 mK, but our dc measurements give an NETD of about 25 mK. We operate the camera at a frame rate of 100 Hz with a spatial resolution of 30 μm with a 0.5× microscopy kit. We also utilize an Agilent E3634A power supply to supply and measure the power consumption of the test chip.

Fig. 13. Error in post-silicon power mapping for dc and ac excitation without rounding.

A. Experiment 1

For this experiment, we utilize our bi-level micro heater grid (10 × 10 micro heaters, each consuming 25 mW when enabled) to create a number of reference maps, and then we capture the resultant thermal emissions using the infrared camera. The reference power maps are given in the first column of Fig. 13. We use our thermal-to-power inversion method to estimate the spatial power maps and compare them with the reference maps for validation. We conduct power estimation for traditional dc excitation and for ac excitation with frequencies of 0.5, 1, 2, 4, and 8 Hz using an integration time of 16 s. We report the estimated power maps in Fig. 13. The percentage error for each individual power map is computed as the absolute error between the reference power map and the estimated power map, normalized by the total power of the map. That is,

\[ \text{Error} = \frac{\sum_k |P_k - P_k^{\text{correct}}|}{\sum_k P_k^{\text{correct}}} \tag{15} \]

where $P^{\text{correct}}$ is the reference power map and $P_k$ is the value of the kth element in the vector P. The results show that ac-based inversion gives a significant reduction in power mapping error compared to dc-based inversion. The average error decreases from 40% at dc to about 8.5% in the ac (2 Hz) method. Rounding the results of the ac method to the nearest level (0 or 25 mW) yields a perfect estimation of the reference maps with no errors.
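The error metric of (15) and the rounding step can be written out in a few lines. The estimated values below are invented for illustration and are not measurements from the paper; the function names are hypothetical.

```python
def mapping_error(estimated, reference):
    """Error of (15): summed absolute deviation normalized by the total
    power of the reference map."""
    deviation = sum(abs(e - c) for e, c in zip(estimated, reference))
    return deviation / sum(reference)


def round_to_levels(estimated, levels):
    """Snap every block estimate to the nearest allowed power level."""
    return [min(levels, key=lambda level: abs(level - e)) for e in estimated]


reference = [25.0, 0.0, 25.0, 0.0]   # mW, bi-level reference map
estimated = [22.0, 3.5, 27.0, 1.0]   # mW, hypothetical inversion result

raw_error = mapping_error(estimated, reference)
rounded = round_to_levels(estimated, levels=[0.0, 25.0])
rounded_error = mapping_error(rounded, reference)

print(f"error before rounding: {100 * raw_error:.1f}%")
print(f"error after rounding:  {100 * rounded_error:.1f}%")
```

Rounding removes the error entirely only when every block estimate already lies closer to its correct level than to any other level, which is why it helps most when the unrounded error is small.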

B. Comparison of Power Mapping Error Against Excitation Frequencies

We summarize the power mapping percentage errors of Fig. 13 in Fig. 14, which gives the average power mapping percentage error for the five shown maps against all the excitation frequencies, including dc. The trends in the power mapping accuracy results are in agreement with the SNR results provided earlier in Section VI. The power mapping error reduces drastically when we move from dc excitation to ac excitation, until the 2–4 Hz range. Increasing the excitation frequency beyond that range leads to a loss in power mapping accuracy.

C. Comparison of Power Mapping Error Against Integration Times

In Section IV, we analyzed the dependency of noise on the integration time. Increasing the integration time (especially at and beyond the corner frequency) reduces the noise, which translates into improved power mapping results. In Fig. 15, we plot the average percentage error for the five spatial maps as a function of the integration time for 2- and 4-Hz power mapping results. The plot shows consistent improvements in accuracy as the integration time is increased. Increasing the integration time requires additional time for data collection and processing and additional space for storing the measurement data.

Fig. 14. Percentage of error versus different frequencies with fixed integration time (16 s). (Axes: frequency in Hz, 0 to 8; average percentage error in %, 0 to 40.)

Fig. 15. Percentage of error versus different integration times. (Axes: integration time in s, 2 to 16; average power mapping error in %, 5 to 40; curves for 2 Hz and 4 Hz.)

D. Experiment 2

In this experiment, we provide results using the multilevel micro heater grid design (6 × 6 micro heaters; each can consume 0, 25, 60, 102, or 142 mW). In contrast to the previous experiment, where the power of each micro heater was restricted to two levels (0 and 25 mW), the constructed reference spatial power maps in this experiment comprise power levels of varying intensities (0, 25, 60, 102, and 142 mW). Our inversion procedure can naturally handle any number of levels, and the results of this experiment confirm this capability. Fig. 16 gives the reference multilevel power maps in the first column. The second column gives the estimated power maps using dc-based excitation. The average error is 21.23%. We give in the third column the results of the dc-based power maps when they are rounded to the nearest power level. In the rounded case, the average error drops to 4.48%. In the fourth column, we give the results from using the proposed ac-based technique at 2 Hz; the average error is 8.30%, which again confirms the drastic improvement compared to the dc case. The last column gives the rounded power maps from ac-based estimation with an average error of 1.11%. The results of this experiment confirm that our technique is capable of handling a large range of power intensities and spatial power maps, and that ac-based power mapping consistently outperforms dc-based power mapping by a large margin.

Fig. 16. Results from multilevel power estimation. Color scale: 0, 25, 60, 102, and 142 mW. Percentage error of each estimated map:

    Reference map    dc        rounded dc    ac (2 Hz)    rounded ac (2 Hz)
    1                14.84%    4.76%         9.09%        0.00%
    2                24.15%    1.81%         4.95%        0.00%
    3                26.88%    3.10%         7.73%        2.76%
    4                19.05%    8.24%         11.45%       1.71%
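The column averages quoted for this experiment can be cross-checked from the per-map errors reported in Fig. 16. The short sketch below simply averages each column; the per-map values are copied from the figure, and the variable names are arbitrary.

```python
# Per-map percentage errors from Fig. 16, one list per column.
errors = {
    "dc": [14.84, 24.15, 26.88, 19.05],
    "rounded dc": [4.76, 1.81, 3.10, 8.24],
    "ac (2 Hz)": [9.09, 4.95, 7.73, 11.45],
    "rounded ac (2 Hz)": [0.00, 0.00, 2.76, 1.71],
}

averages = {name: sum(values) / len(values) for name, values in errors.items()}

for name, value in averages.items():
    print(f"{name:>18}: {value:.3f}%")
```

The averages agree with the quoted figures to within the rounding of the per-map values.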

IX. APPLICABILITY TO GENERAL-PURPOSE PROCESSORS

In this section, we demonstrate the basic applicability of our technique to general-purpose processors. A full ac-based power mapping demonstration is infeasible for us because no commercial processor has its design available in the public domain. Thus, we focus on demonstrating the ability to: 1) alternate between two discrete power levels at different excitation frequencies and 2) analyze the resultant thermal images to show the reduction of the spatial heat diffusion as a function of the excitation frequency. We also describe a procedure to generate the thermal-to-power modeling matrix (R) for arbitrary designs.

For our demonstration, we utilize an AMD Athlon II dual-core processor that is embedded in a motherboard with 4 GB of memory and running the Linux operating system. The layout of the processor is given in Fig. 17. We execute a floating-point application of a stable nature on one of the cores. To alternate the power level between two values, we execute a script that alternates the dynamic voltage-frequency setting (DVFS) between two levels. For example, Fig. 18 shows the power signal of the processor alternating between two values, completing a full cycle every 1.1 s. Fig. 19 gives the amplitude of the power signal in the frequency domain, clearly showing the fundamental component of the square wave at 0.9 Hz and its odd harmonics. Switching the DVFS settings takes a few microseconds, which is negligible compared to the period of the DVFS alternation and results in excellent frequency-domain characteristics.

Fig. 17. AMD Athlon II dual-core processor chosen for demonstration. (Dimension labels shown in the figure: 14 mm and 8.5 mm.)

Fig. 18. Impact of alternating DVFS between two levels of power. (Axes: time in s, 0 to 25; processor power in W, 14 to 28.)

Fig. 19. Frequency-domain representation of the power signal. (Axes: frequency in Hz, 0 to 15; amplitude in W, 0 to 5.)
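The link between the 1.1-s alternation period and the spectrum of Fig. 19 can be reproduced with an idealized square wave. In the sketch below, the two power levels (16 W and 26 W), the 100-Hz sampling rate, and the 50% duty cycle are assumptions chosen for illustration, not values read from the measurements; the function name is hypothetical.

```python
import cmath
import math


def amplitude_at(samples, sample_rate, freq):
    """Single-frequency Fourier amplitude of a sampled signal."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * k / sample_rate)
              for k, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


period = 1.1                  # s, one full DVFS cycle
sample_rate = 100.0           # Hz, assumed power-meter sampling rate
samples_per_period = 110      # period * sample_rate
low, high = 16.0, 26.0        # W, hypothetical power levels

# Ten full cycles of an ideal square wave with a 50% duty cycle.
samples = [high if (k % samples_per_period) < samples_per_period // 2 else low
           for k in range(10 * samples_per_period)]

f0 = 1.0 / period
print(f"fundamental frequency: {f0:.3f} Hz")
for harmonic in (1, 2, 3):
    amp = amplitude_at(samples, sample_rate, harmonic * f0)
    print(f"harmonic {harmonic}: {amp:.3f} W")
```

For an ideal square wave of half-swing A, the kth odd harmonic has amplitude close to 4A/(kπ) and the even harmonics vanish, which matches the qualitative shape described for Fig. 19.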

Fig. 20(a) gives the dc thermal map when the DVFS settings do not alternate. Using our DVFS alternating script, we collect the thermal emissions and process them as described in Section III for the cases of 0.9 and 1.8 Hz. Fig. 20(b) gives the ac thermal map when DVFS settings alternate at 0.9 Hz, and Fig. 20(c) gives the ac thermal map when DVFS settings alternate at 1.8 Hz. The thermal maps show that the extent of spatial heat diffusion reduces as the alternation frequency of the DVFS setting increases. The results confirm the applicability of our ac-based technique to general-purpose processors.

Fig. 20. Thermal maps at various excitation frequencies. (a) Thermal image with dc excitation. (b) Thermal image with 0.9 Hz ac excitation. (c) Thermal image with 1.8 Hz ac excitation.

In Section VIII, we utilized the programmability of FPGAs to estimate the model matrix R. For other chips like the dual-core processor, the matrix R can be estimated in the same conceptual way but through a different implementation approach [3], [10]. One approach is to turn off the chip and scan a laser beam with known power density to deliver the power from the outside to the regions of interest. The scanning of the laser system can be automated by using a pair of galvo-directing mirrors [3]. Another approach is to use the actual design and layout of the chip to conduct a fluid dynamics simulation coupled with a heat diffusion simulation to estimate the matrix R [3], [10].

X. CONCLUSION

In this paper, we investigated the challenges in power mapping and proposed ac-based thermography techniques to overcome the challenges. We presented theoretical analysis and experimental validation for the impact of ac excitation on measurement noise and spatial heat diffusion. To quantify the noise in our system, we analyzed and quantified the SNR. We devised techniques for realistic estimation of the parameters of the thermal-to-power operator matrix, and we devised numerical techniques to invert the thermal emissions into power estimates. We crafted a programmable test chip to scientifically evaluate the accuracy of our thermal-to-power inversion methods. Our test chip enabled us to create any desired spatial power map. Using a number of constructed intricate power maps, we demonstrated that ac-based power mapping dramatically improves post-silicon power mapping. We analyzed the power mapping results for different ac excitation frequencies and integration times, and linked these results to the SNR analysis. We also demonstrated the applicability of our techniques to a dual-core processor.

ACKNOWLEDGMENT

The authors would like to thank S. Angelevski for his help with the multi-level power mapping experiments, and the anonymous reviewers who helped improve the content of this paper.

REFERENCES

[1] S. H. Gunther, “Managing the impact of increasing microprocessor power consumption,” Int. Technol. J., vol. 5, no. 1, pp. 1–9, 2001.

[2] D. Brooks, R. Dick, R. Joseph, and L. Shang, “Power, thermal, and reliability modeling in nanometer-scale microprocessors,” IEEE Micro, vol. 27, no. 3, pp. 49–62, May 2007.

[3] H. Hamann, A. Weger, J. Lacey, Z. Hu, and P. Bose, “Hotspot-limited microprocessors: Direct temperature and power distribution measurements,” IEEE J. Solid-State Circuits, vol. 42, no. 1, pp. 56–65, Jan. 2007.

[4] F. Najm, “Power estimation techniques for integrated circuits,” in Proc. Int. Conf. Comput.-Aided Design, 1995, pp. 492–499.

[5] D. Brooks, V. Tiwari, and M. Martonosi, “Wattch: A framework for architectural-level power analysis and optimizations,” in Proc. Int. Symp. Comput. Arch., 2000, pp. 83–94.

[6] M. Powell, A. Biswas, J. Emer, and S. Mukherjee, “CAMP: A technique to estimate per-structure power at run-time using a few simple parameters,” in Proc. Int. Symp. High Perform. Comput. Arch., 2009, pp. 289–300.

[7] A. N. Nowroz, G. Woods, and S. Reda, “Improved post-silicon power modeling using ac lock-in techniques,” in Proc. Design Autom. Conf., 2011, pp. 101–106.

[8] Z. Qi, B. H. Meyer, W. Huang, R. J. Ribando, K. Skadron, and M. R. Stan, “Temperature-to-power mapping,” in Proc. Int. Conf. Comput. Design, 2010, pp. 384–389.

[9] R. Cochran, A. Nowroz, and S. Reda, “Post-silicon power characterization using thermal infrared emissions,” in Proc. Int. Symp. Low Power Electron. Design, 2010, pp. 331–336.

[10] F. J. Mesa-Martinez, M. Brown, J. Nayfach-Battilana, and J. Renau, “Measuring performance, power, and temperature from real processors,” in Proc. Int. Symp. Comput. Arch., 2007, pp. 1–10.

[11] O. Breitenstein, W. Warta, and M. Langenkamp, Lock-In Thermography: Basics and Use for Functional Diagnostics of Electronic Components, 2nd ed. New York: Springer-Verlag, 2010.

[12] S. Wilton and N. P. Jouppi, “CACTI: An enhanced cache access and cycle time model,” IEEE J. Solid-State Circuits, vol. 31, no. 5, pp. 677–688, May 1996.

[13] S. Reda, “Thermal and power characterization techniques for real computing systems,” IEEE J. Emerg. Topics Circuits Syst., vol. 1, no. 2, pp. 76–87, Jun. 2011.

[14] S. Reda, A. N. Nowroz, R. Cochran, and S. Angelevski, “Post-silicon power mapping techniques for integrated circuits,” VLSI Integr. J., 2012, to be published.

[15] A. N. Nowroz and S. Reda, “Thermal and power characterization of field-programmable gate arrays,” in Proc. Int. Symp. Field Program. Gate Array, 2011, pp. 111–114.

[16] K. Etessam-Yazdani, M. Asheghi, and H. Hamann, “Investigation of the impact of power granularity on chip thermal modeling using white noise analysis,” IEEE Trans. Compon. Packag. Technol., vol. 31, no. 1, pp. 211–215, Mar. 2008.

[17] W. Huan, M. R. Stan, K. Sankaranarayanan, R. J. Ribando, and K. Skadron, “Many-core design from a thermal perspective,” in Proc. Design Autom. Conf., 2008, pp. 746–749.

[18] E. Marin, “The role of thermal properties in periodic time-varying phenomena,” Eur. J. Phys., vol. 28, no. 3, pp. 429–445, 2007.

[19] S. Huth, O. Breitenstein, A. Huber, and D. Dantz, “Lock-in ir-thermography − a novel tool for material and device characterization,” Solid State Phenomena, nos. 82–84, pp. 741–746, 2002.

[20] C. Hsieh, C. Wu, F. Jih, and T. Sun, “Focal-plane-arrays and cmos readout techniques of infrared imaging systems,” IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 4, pp. 594–605, Aug. 1997.

[21] J. Chang, A. A. Abidi, and C. R. Viswanathan, “Flicker noise in CMOS transistors from subthreshold to strong inversion at various temperatures,” IEEE Trans. Electron. Devices, vol. 41, no. 11, pp. 1965–1971, Nov. 1994.

[22] A. Rogalski and K. Chrzanowski, “Infrared devices and techniques,” Opt. Electron. Rev., vol. 10, no. 2, pp. 111–136, 2002.

Abdullah Nazma Nowroz graduated (summa cum laude) in electrical and electronics engineering from Boston University, Boston, MA, and received the Master’s degree in electrical engineering from the University of Southern California, Los Angeles, in 2006 and 2008, respectively. She is currently pursuing the Ph.D. degree in electrical engineering with Brown University, Providence, RI.

Her current research interests include various thermal management methods and characterization of semiconductor circuits.

Ms. Nowroz is a member of ACM, Tau Beta Pi, Eta Kappa Nu, and NSCS.

Gary Woods received the B.S.E.E. and B.A. degrees in physics from Rice University, Houston, TX, in 1988, and the M.S. and Ph.D. degrees in applied physics from Stanford University, Stanford, CA, in 1991 and 1997, respectively.

He was in Silicon Valley with Intel Corp. and two startup companies, Spectralane Inc. (Co-Founder and CTO) and Optonics Inc. (acquired by Credence Systems Corp.), from 1998 to 2008, where his research was focused on development and utilization of optical probing techniques for debugging and failure analysis of integrated circuits. He joined Rice University in 2008 and was also a Co-Founder of InView Technology Corp. in 2009. He is currently a Professor in the Practice with the Department of Electrical and Computer Engineering, Rice University.

Sherief Reda received the Ph.D. degree in computer science and engineering from the University of California, San Diego, in 2006.

He is currently an Assistant Professor with the School of Engineering, Brown University, Providence, RI. His current research interests include thermal and power characterization and management of computing systems, physical design techniques, and variability modeling and yield improvement techniques for planar and 3-D integrated circuits.

Dr. Reda was a recipient of the Best Paper Award from the DATE’02 and the ISLPED’10, and the NSF CAREER Award in 2010. He was also a four-time nominee for the Best Paper Award.