
cseweb.ucsd.edu/~xujiao/poster/Expo_V3.pdf · 2015. 4. 18.


A Scalable Model for Timing Error Prediction under Hardware and Workload Variations

Xun Jiao, Abbas Rahimi, Rajesh Gupta
Computer Science & Engineering, University of California, San Diego

NSF Expedition in Computing, Variability-Aware Software for Efficient Computing with Nanoscale Devices, http://variability.org

• Manufacturing, environmental and workload variabilities lead to timing errors in hardware.

• Mitigating variations with conservative guardbands causes efficiency loss.

• Resilient techniques: 1) predict and prevent; 2) error ignorance.

• Build a model to predict timing errors at 95% accuracy.

• Vary parameters: workload, ΔV = 0.13 V, ΔT = 50 °C.

• Reduce the guardband by 0%-15% for approximate applications.

• Guarantee output quality.

Timing error and model analysis flow

Approximate applications and their bit-level specification

Reference PSNR=30dB

Approximation

Bit-level specification for each operator type: fault injection and profiling.

Input: application tolerance threshold, operator type

while (bit i)
    if (random() > prob)
        output = output ^ (1 << bit[i]);
    check: (PSNR > 26dB) ? (prob--) : (prob++);

Output: prob is the bit specification.
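The fault-injection profiling loop can be sketched in Python. This is only a minimal sketch under stated assumptions: `run_app` and `quality_ok` are hypothetical callbacks standing in for the application and its tolerance check (for example PSNR > 26 dB), the step size and round count are arbitrary, and `prob` is read as the probability that the bit is left correct, which is what the `random() > prob` test implies.

```python
import random

def inject_bit_fault(value, bit, prob, rng):
    """Flip `bit` of a 32-bit integer with probability 1 - prob."""
    if rng.random() > prob:
        value ^= (1 << bit)
    return value & 0xFFFFFFFF

def profile_bit(bit, run_app, quality_ok, rounds=50, step=0.05, seed=0):
    """Search for the lowest `prob` at which the application still passes.

    run_app(corrupt) is a hypothetical hook: it runs the application and
    calls corrupt(x) on every output of the operator under study.
    quality_ok(result) applies the application tolerance threshold.
    """
    rng = random.Random(seed)
    prob = 1.0  # start from a fully reliable bit
    for _ in range(rounds):
        result = run_app(lambda x: inject_bit_fault(x, bit, prob, rng))
        if quality_ok(result):
            prob = max(0.0, prob - step)  # quality holds: tolerate more errors
        else:
            prob = min(1.0, prob + step)  # quality violated: demand more reliability
    return prob
```

In this reading, a bit whose returned `prob` is low can tolerate frequent timing errors, while a bit that stays near 1.0 must be kept correct.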

Experimental Results

Reliability specification and prediction accuracy in Sobel filter (0.85V, 50°C)(red circle refers to violation)

Benchmark              0      -      23     24     25     26     27     28     29     30     31
Sobel Filter           15/15  -      15/15  15/15  15/15  15/15  15/15  15/15  10/10  0/5    15/15
Gaussian Filter        15/15  -      15/15  10/10  10/10  10/10  10/10  10/10  10/10  10/10  10/10
Matrix Multiplication  15/15  -      15/15  10/10  15/15  15/15  15/15  15/15  15/15  15/15  15/15
DCT                    15/15  -      15/15  15/15  10/15  15/15  15/15  15/15  15/15  15/15  15/15

Benchmark              0      -      17     18     19     20     21     22     23     24     25     26     27     28     29     30     31
Sobel Filter           15/15  -      10/15  10/15  10/15  15/15  15/15  15/15  10/5   10/5   10/5   10/5   10/0   10/0   10/0   10/0   15/15
Gaussian Filter        15/15  -      15/15  15/15  15/15  15/15  10/10  5/5    10/5   10/5   10/5   10/5   10/5   10/5   5/5    0/0    10/5
Matrix Multiplication  15/15  -      15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/15  15/10  15/10  15/10
DCT                    15/15  -      15/15  15/15  15/15  15/15  15/15  10/10  15/10  15/10  10/10  15/5   15/5   10/5   10/10  0/5    15/15

Bit level guardband reduction percentage for the Adder/Multiplier at (0.72V, 0°C)/(0.85V, 50°C)

Benchmark              Multiplier  Adder  SQRT
Sobel Filter           0/5         10/0   15/15
Gaussian Filter        10/10       0/0    -
Matrix Multiplication  15/15       15/10  -
DCT                    10/15       0/5    -

Instruction level guardband reduction percentage at (0.72V, 0°C) / (0.85V, 50°C) regarding different benchmarks

Conclusion

• Manufacturing variability presents new challenges to continued scaling of microelectronic designs that can no longer be addressed by increasing design guardbands, which are already over 40% of the nominal design targets.

• This project seeks to improve the accuracy of predictive timing analysis as well as application-specific error tolerance, both of which help reduce the design guardbands.

• Determination of application-level error tolerance provides a promising approach to the "approximate computing" paradigm.

• The proposed classifier model enables prediction of floating point unit timing errors with an average accuracy of 95% based on four varying parameters: workload, operating voltage, temperature, and clock speed.

• The prediction accuracy satisfies the bit-level specification of various approximate computation applications.

• Guardband reduction ranging from 0% to 15% could be achieved for different instructions using our model under a wide range of variability conditions: voltage variation ΔV = 0.13 V and temperature variation ΔT = 50 °C.

Variability leads to performance degradation

Example: Training data {x_i[t-1], x_i[t]} is xx01…10xx01, where x is an erroneous bit, and target data {y_i} is 010…1x0x1x.

Classifier Model based on Supervised learning to predict timing errors

Pipelined floating point unit (FPU)

History notion

Binary classifier

Bit level granularity

(V_i, T_i) corner
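The classifier setup (history of inputs, one binary decision per output bit, one model per (V, T) corner) can be illustrated with a small Python sketch. The poster does not name the learning algorithm, so a plain perceptron is used here purely as a stand-in. The names `to_bits`, `features` and `BitPerceptron` are hypothetical, as is the choice to concatenate the bits of the previous and current input as features.

```python
def to_bits(word, width=32):
    """Return the bits of `word` as a list of 0/1 values, LSB first."""
    return [(word >> i) & 1 for i in range(width)]

def features(prev_input, cur_input, width=32):
    """History-aware feature vector: bits of x[t-1] followed by bits of x[t]."""
    return to_bits(prev_input, width) + to_bits(cur_input, width)

class BitPerceptron:
    """Stand-in binary classifier for one output bit at one (V, T) corner."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x):
        s = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s > 0 else 0  # 1 means a timing error is predicted

    def fit(self, samples, labels, epochs=20):
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                err = y - self.predict(x)
                if err:  # update weights only on a misclassification
                    self.b += err
                    self.w = [wi + err * xi for wi, xi in zip(self.w, x)]
        return self
```

Under these assumptions, a full predictor would hold one such model per output bit and per operating corner, for example in a dictionary keyed by (V, T, bit).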

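\[
G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix}, \quad
G_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}, \quad
|G| = \sqrt{G_x^2 + G_y^2}
\]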
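To show where the adder, multiplier and SQRT operators appear in the Sobel computation, here is a minimal Python sketch using the standard Sobel kernels. The function name `sobel_magnitude` and the list-of-lists image layout are assumptions made for illustration.

```python
import math

# Standard 3x3 Sobel kernels
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def sobel_magnitude(image, r, c):
    """Gradient magnitude |G| at interior pixel (r, c) of a 2D list."""
    gx = gy = 0.0
    for i in range(3):
        for j in range(3):
            p = image[r - 1 + i][c - 1 + j]
            gx += GX[i][j] * p  # one multiply and one add per tap
            gy += GY[i][j] * p
    return math.sqrt(gx * gx + gy * gy)  # two multiplies, one add, one SQRT
```

For a 3x3 patch with a vertical edge, such as `[[0, 0, 10], [0, 0, 10], [0, 0, 10]]`, only the horizontal gradient is non-zero.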

• Image processing applications: Sobel filter and Gaussian filter (PSNR = 26 dB threshold).

• General purpose applications: Matrix Multiplication and Discrete Cosine Transform (Deviation = 10% threshold).
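The two quality thresholds can be checked with a few lines of Python. PSNR follows its usual definition for 8-bit data. The poster does not define "Deviation", so `relative_deviation` below is an assumed definition (summed absolute error divided by summed absolute reference), and both function names are hypothetical.

```python
import math

def psnr(reference, approx, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs
    return 10.0 * math.log10(peak * peak / mse)

def relative_deviation(reference, approx):
    """Assumed metric: total absolute error relative to the reference."""
    err = sum(abs(r - a) for r, a in zip(reference, approx))
    ref = sum(abs(r) for r in reference)
    return err / ref

def image_ok(reference, approx, threshold_db=26.0):
    """Acceptance check for the image-processing applications."""
    return psnr(reference, approx) > threshold_db

def general_ok(reference, approx, threshold=0.10):
    """Acceptance check for the general-purpose applications."""
    return relative_deviation(reference, approx) <= threshold
```

Whether the 10% threshold is inclusive is not stated in the poster; the `<=` here is a choice made for the sketch.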

Sobel filter approximate output

[Figure: three panels (Adder, Multiplier, SQRT) plotting probability (0.0 to 1.0) against bit position (0 to 31). Legend: specification, 5% GBR, 10% GBR, 15% GBR.]

[Figure labels: clock, actual circuit delay, guardband; across-wafer frequency, VCC droop, temperature, aging.]

Sobel filter formulation: Adder, Multiplier and SQRT