A Reconfigurable Stochastic Architecture for Highly Reliable Computing


Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja. A Reconfigurable Stochastic Architecture for Highly Reliable Computing. Electrical & Computer Engineering, University of Minnesota. GLSVLSI, Boston – May 12, 2009.


Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja

A Reconfigurable Stochastic Architecture for Highly Reliable Computing


Electrical & Computer Engineering, University of Minnesota


GLSVLSI, Boston – May 12, 2009

Opportunities & Challenges

Novel materials, devices, technologies:

• High density of bits/logic/interconnects.

Challenges for logic synthesis:

• Topological constraints.
• Inherent structural randomness.
• High defect rates.

[Figure: array of N wires by M wires]

Opportunities & Challenges

Strategy:

• Cast synthesis in terms of arithmetic operations on real values.
• Synthesize circuits that compute logical values with probability corresponding to the real-valued inputs and outputs.

[Figure: array of N wires by M wires]

Probabilistic Signals

Claude E. Shannon, 1916–2001

“A Mathematical Theory of Communication,” Bell System Technical Journal, 1948.

[Figure: deterministic → random → deterministic]

Probabilistic Analysis

• Circuit Reliability
  – Probabilistic fault models.
  – Random test pattern generation.

• Statistical Timing Power (circuit level).

• Statistical Performance Measures (architectural level).

Probabilistic Analysis

“There are known knowns; and there are unknown unknowns; but today I’ll speak of the known unknowns.”
– Donald Rumsfeld, 2004

[Figure: Probabilistic Inputs → Digital Circuit → Probabilistic Outputs; labels: Independent, Known, Unknown]


Synthesis of Probabilistic Circuits

[Figure: Probabilistic Inputs → Digital Circuit → Probabilistic Outputs; labels: Unknown (for us to design), Specified, Independent, Known, Unknown]

Synthesis of Probabilistic Logic

• Shannon and von Neumann:
  – “Probabilistic Logic,”
  – “Reliable Circuits Using Less Reliable Relays”.

• K. Nepal, R. Bahar, J. Mundy, W. Patterson, and A. Zaslavsky, “Designing Logic Circuits for Probabilistic Computation in the Presence of Noise.”

• L. Chakrapani, P. Korkmaz, B. Akgul, and K. Palem, “Probabilistic System-on-a-chip Architecture.”

Stochastic Logic

Probability values are the input and output signals.

[Figure: combinational circuit with input t = 0.7 and outputs 0.616 and 0.468]

Stochastic Logic

Probability values are the input and output signals.

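Functions of a probability value t:

$$0.4t^2 + 0.6t \qquad \text{and} \qquad -0.8t^2 + 0.8t + 0.3$$

with $\Pr(X=1) = \Pr(Z=1) = t$ (independently) and $\Pr(Y=1) = 0.3$.

[Figure: two gate-level circuits, each with inputs X, Y, Z; X and Z carry probability t, Y carries probability 0.3]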

Stochastic Logic

Stochastic Bit Streams

A real value x in [0, 1] is encoded as a stream of bits X. For each bit, the probability that it is one is: P(X=1) = x.

Example: x = 2/5, X = 0,1,0,1,0
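The encoding above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the talk: the helper names `encode` and `decode` and the use of a seeded software random generator are assumptions made for the example.

```python
import random

def encode(x, length, rng):
    """Return `length` independent bits, each equal to 1 with probability x."""
    return [1 if rng.random() < x else 0 for _ in range(length)]

def decode(bits):
    """Estimate the encoded value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)

rng = random.Random(0)                    # fixed seed so the run is repeatable

slide_stream = [0, 1, 0, 1, 0]            # the stream shown on the slide
slide_value = decode(slide_stream)        # two ones out of five bits

long_stream = encode(0.4, 10_000, rng)    # a longer random stream for x = 0.4
estimate = decode(long_stream)            # close to 0.4, but not exact

print(slide_value, estimate)
```

The estimate fluctuates from run to run when the seed changes; longer streams give tighter estimates.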

Probabilistic Bundles

A real value x in [0, 1] is encoded as a stream of bits X. For each bit, the probability that it is one is: P(X=1) = x.

Example: x = 2/5, X = 0 1 0 0 1

Stochastic Logic

Probability values are the input and output signals.

[Figure: combinational circuit with signal values 5/8, 3/8, 4/8, 3/8, 4/8, 8/8]

Stochastic Logic

Probability values are the input and output signals.

[Figure: combinational circuit operating on serial bit streams]

1,1,0,1,0,1,1,0,…
1,0,0,0,1,1,0,0,…
0,1,1,0,1,0,1,0,…
0,1,1,0,1,0,0,0,…
1,0,1,0,1,0,1,0,…
1,1,1,1,1,1,1,1,…

Stochastic Logic

Probability values are the input and output signals.

[Figure: combinational circuit operating on parallel bit streams with values 4/8, 3/8, 4/8, 8/8, 5/8, 3/8]

Randomness

Analog interface with fractional weighting of 1’s.

[Figure: combinational circuit operating on parallel bit streams, with A/D blocks at the interface]

Randomness

Analog interface with fractional weighting of 1’s.

[Figure: combinational circuit operating on parallel bit streams, with LFSR and Accumulator blocks at the interface]
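The slide names LFSR and Accumulator blocks without giving their internals. A minimal sketch of how such a pair might work follows, under two assumptions that are not from the talk: the randomizer compares an 8-bit LFSR state against a threshold, and the LFSR uses feedback taps at bit positions 8, 6, 5, 4 (a standard maximal-length choice with period 255). All function names are hypothetical.

```python
def lfsr8_states(seed=1):
    """Yield successive states of an 8-bit Fibonacci LFSR (taps 8, 6, 5, 4)."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("an all-zero seed locks the LFSR")
    while True:
        yield state
        # XOR of the tapped bits becomes the new most significant bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)

def randomizer(threshold, cycles, seed=1):
    """Emit 1 whenever the LFSR state is <= threshold (comparator assumption)."""
    gen = lfsr8_states(seed)
    return [1 if next(gen) <= threshold else 0 for _ in range(cycles)]

def accumulator(bits):
    """De-randomizer: count the ones in the stream."""
    return sum(bits)

PERIOD = 255                                   # the LFSR visits every nonzero state once
stream = randomizer(threshold=102, cycles=PERIOD)
ones = accumulator(stream)
value = ones / PERIOD                          # fraction of ones over one full period

print(ones, value)
```

Over one full period the comparator fires exactly `threshold` times, so the accumulated count recovers the encoded value; shorter streams only approximate it.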

Nanowire Crossbar (idealized)

Randomized connections, yet nearly one-to-one.

[Figure: crossbar of N wires by M wires, with labels A and VDD]

Fault Tolerance

Conventional approach: binary radix encoding.

Inputs: 0.111 (7/8) and 0.010 (2/8). Output: 0.001 (1/8).

Fault Tolerance

Conventional approach: binary radix encoding.

Inputs: 0.111 (7/8) and 0.110 (6/8). Output: 0.101 (5/8).

Bit flips can result in large error.

Fault Tolerance

Stochastic Logic

• Highly redundant.
• Complex operations can be performed with simple logic.

AND gate: inputs 0111111… (7/8) and 1100000… (2/8), output 01000000… (1/8).

Fault Tolerance

Stochastic Logic

• Highly redundant.
• Complex operations can be performed with simple logic.

AND gate: inputs 0111111… (7/8) and 1100100… (3/8), output 01000100… (2/8).

Bit flips never result in large errors.
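The contrast between the two encodings can be checked with a small experiment. The sketch below is an illustration, not the authors' evaluation: it flips each bit of a codeword in turn and records the largest change in decoded value, for a 3-bit binary fraction and for an 8-bit stream that both encode 2/8. The function names are hypothetical.

```python
def radix_value(bits):
    """Value of a binary fraction 0.b1 b2 b3 ... (most significant bit first)."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))

def stream_value(bits):
    """Value of a stochastic stream: the fraction of ones."""
    return sum(bits) / len(bits)

def worst_single_flip_error(bits, value_of):
    """Largest change in decoded value caused by flipping any one bit."""
    base = value_of(bits)
    worst = 0.0
    for i in range(len(bits)):
        flipped = bits[:i] + [1 - bits[i]] + bits[i + 1:]
        worst = max(worst, abs(value_of(flipped) - base))
    return worst

# Both codewords encode 2/8.
radix_worst = worst_single_flip_error([0, 1, 0], radix_value)
stream_worst = worst_single_flip_error([1, 1, 0, 0, 0, 0, 0, 0], stream_value)

print(radix_worst, stream_worst)
```

In the radix code the worst case is a flip of the most significant bit, which moves the value by 1/2; in the stream every bit carries the same weight, so a single flip moves the value by only 1/8.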

Arithmetic Operations

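Multiplication

[Figure: AND gate with inputs A, B and output C]

$$c = P(C) = P(A)\,P(B) = a\,b$$

(Scaled) Addition

[Figure: MUX with data inputs A, B (labeled 0 and 1), select input S, and output C]

$$c = P(C) = P(S)\,P(A) + [1 - P(S)]\,P(B) = s\,a + (1 - s)\,b$$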
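Both identities can be checked by simulation. The following is a rough sketch that assumes the input streams are independent and uses a seeded software generator in place of hardware randomness; the function names are chosen for this example only.

```python
import random

def bitstream(p, n, rng):
    """n independent bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def value(bits):
    """Fraction of ones in a stream."""
    return sum(bits) / len(bits)

def multiply(a_bits, b_bits):
    """AND gate applied cycle by cycle."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def scaled_add(a_bits, b_bits, s_bits):
    """MUX applied cycle by cycle: pass A when S is 1, otherwise pass B."""
    return [a if s else b for a, b, s in zip(a_bits, b_bits, s_bits)]

rng = random.Random(1)
N = 100_000
a, b, s = 0.875, 0.25, 0.5
A, B, S = (bitstream(p, N, rng) for p in (a, b, s))

product = value(multiply(A, B))       # should be near a*b = 0.21875
mix = value(scaled_add(A, B, S))      # should be near s*a + (1-s)*b = 0.5625

print(product, mix)
```

The results are estimates, so they match the formulas only up to sampling noise that shrinks as the stream length grows.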

Synthesizing Stochastic Logic

[Figure: t → combinational circuit → g(t)]

Only polynomials…

Questions:

• What kinds of functions can be implemented in the probabilistic domain?

• How can we synthesize the logic to implement these?

Synthesizing Polynomials

[Figure: t → combinational circuit → g(t)]

Only polynomials…

• Implement polynomials using AND (multiplication) and MUX (scaled addition).

• Must consider polynomials with coefficients less than 0 or larger than 1…

A little math…

Bernstein basis polynomial of degree n:

$$B_i^n(t) = \binom{n}{i}\, t^i (1 - t)^{n - i}, \qquad i = 0, 1, \ldots, n$$

A little math…

Bernstein polynomial of degree n:

$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$

$b_i^n$ is a Bernstein coefficient.
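For readers who want to experiment with these definitions, here is a small evaluation sketch. The function names and the sample coefficient list `[0.2, 0.9, 0.5]` are made up for illustration and do not come from the talk.

```python
from math import comb

def bernstein_basis(i, n, t):
    """B_i^n(t) = C(n, i) * t^i * (1 - t)^(n - i)."""
    return comb(n, i) * t**i * (1 - t)**(n - i)

def bernstein_poly(coeffs, t):
    """Evaluate sum_i b_i * B_i^n(t), with n = len(coeffs) - 1."""
    n = len(coeffs) - 1
    return sum(b * bernstein_basis(i, n, t) for i, b in enumerate(coeffs))

# The basis polynomials of a given degree sum to 1 at any t.
partition = sum(bernstein_basis(i, 3, 0.3) for i in range(4))

# At the endpoints the polynomial takes its first and last coefficients.
end_values = (bernstein_poly([0.2, 0.9, 0.5], 0.0),
              bernstein_poly([0.2, 0.9, 0.5], 1.0))

print(partition, end_values)
```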

A little math…

Obtain Bernstein coefficients from power-form coefficients:

Given

$$g(t) = \sum_{i=0}^{n} a_i^n t^i = \sum_{i=0}^{n} b_i^n B_i^n(t),$$

we have

$$b_i^n = \sum_{j=0}^{i} \frac{\binom{i}{j}}{\binom{n}{j}}\, a_j^n, \qquad i = 0, 1, \ldots, n.$$
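The conversion formula translates directly into code. The sketch below uses exact rational arithmetic; the function name `power_to_bernstein` is an assumption for this example, and the input list `[0, 3, -8, 6]` stands for the polynomial 3t − 8t² + 6t³.

```python
from fractions import Fraction
from math import comb

def power_to_bernstein(a):
    """Map power-form coefficients a[0..n] to Bernstein coefficients b[0..n]."""
    n = len(a) - 1
    return [
        sum(Fraction(comb(i, j), comb(n, j)) * a[j] for j in range(i + 1))
        for i in range(n + 1)
    ]

# 3t - 8t^2 + 6t^3 in power form, lowest degree first.
b = power_to_bernstein([0, 3, -8, 6])

print(b)
```

One of the resulting coefficients is negative, which is why a further step is needed before the coefficients can be read as probabilities.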

Example: Converting a Polynomial

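Power-Form Polynomial:

$$g(t) = 3t - 8t^2 + 6t^3$$

Bernstein Polynomial:

$$g(t) = B_1^3(t) - \tfrac{2}{3} B_2^3(t) + B_3^3(t)$$

$$g(t) = \tfrac{3}{4} B_1^4(t) + \tfrac{1}{6} B_2^4(t) - \tfrac{1}{4} B_3^4(t) + B_4^4(t)$$

$$g(t) = \tfrac{3}{5} B_1^5(t) + \tfrac{2}{5} B_2^5(t) + B_5^5(t)$$

(degree 5: coefficients in unit interval)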

Synthesizing Polynomials

[Figure: t → combinational circuit → g(t)]

Synthesis steps:

1. Convert the polynomial into a Bernstein form.

2. Elevate it until all coefficients are in the unit interval.

3. Implement this with “generalized multiplexing”.
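Step 2 can be sketched with the standard degree-elevation identity for Bernstein coefficients, $b_i^{n+1} = \tfrac{i}{n+1} b_{i-1}^n + (1 - \tfrac{i}{n+1}) b_i^n$. The identity is textbook material rather than something stated on the slides, and the function names and the `max_degree` safety limit are assumptions for this example.

```python
from fractions import Fraction

def elevate(b):
    """Raise a degree-n Bernstein coefficient list to degree n + 1."""
    n = len(b) - 1
    out = [b[0]]                                  # first coefficient is unchanged
    for i in range(1, n + 1):
        w = Fraction(i, n + 1)
        out.append(w * b[i - 1] + (1 - w) * b[i])
    out.append(b[n])                              # last coefficient is unchanged
    return out

def elevate_into_unit_interval(b, max_degree=64):
    """Elevate repeatedly until every coefficient lies in [0, 1]."""
    while not all(0 <= c <= 1 for c in b):
        if len(b) - 1 >= max_degree:
            raise ValueError("coefficients did not enter [0, 1] within the limit")
        b = elevate(b)
    return b

# Degree-3 Bernstein coefficients of 3t - 8t^2 + 6t^3.
start = [Fraction(0), Fraction(1), Fraction(-2, 3), Fraction(1)]
result = elevate_into_unit_interval(start)

print(len(result) - 1, result)
```

For this input the loop stops at degree 5, where all coefficients are valid probabilities.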

Probabilistic Multiplexing

[Figure: MUX with data inputs A, B, select input T, and output C]

$$c = P(C) = P(T)\,P(A) + [1 - P(T)]\,P(B) = t\,a + (1 - t)\,b$$

Probabilistic Multiplexing

X1, …, Xn are independent Boolean random variables with Pr(Xi=1) = t, for 1 ≤ i ≤ n.

Z0, …, Zn are independent Boolean random variables with Pr(Zi=1) = $b_i^n$, for 0 ≤ i ≤ n.

Bernstein polynomial:

$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
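One way to see where the Bernstein form comes from is sketched below. It assumes that the output Y is the input $Z_k$ selected by the count $k = X_1 + \cdots + X_n$ (as the adder-and-MUX diagram in the example suggests) and that the Z inputs are independent of the X inputs; neither assumption is spelled out on this slide.

```latex
\begin{align*}
\Pr\Big(\sum_{j=1}^{n} X_j = i\Big)
  &= \binom{n}{i}\, t^i (1 - t)^{n - i} = B_i^n(t), \\
\Pr(Y = 1)
  &= \sum_{i=0}^{n} \Pr\Big(\sum_{j=1}^{n} X_j = i\Big)\, \Pr(Z_i = 1)
   = \sum_{i=0}^{n} b_i^n B_i^n(t).
\end{align*}
```

The first line is the binomial distribution of the number of ones among n independent bits; the second conditions on that count.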

A Reconfigurable Architecture

Implement different functions by setting the coefficients:

$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$

Example

Implement

$$f(t) = \frac{1}{4} + \frac{9}{8} t - \frac{15}{8} t^2 + \frac{5}{4} t^3$$

Example

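Convert to

$$f(t) = \tfrac{2}{8} B_0^3(t) + \tfrac{5}{8} B_1^3(t) + \tfrac{3}{8} B_2^3(t) + \tfrac{6}{8} B_3^3(t)$$

[Figure: an adder sums the input streams x1, x2, x3; the sum selects one of the MUX inputs z0 to z3 (labeled 0 to 3); the MUX output is y]

x1: 0,0,0,1,1,0,1,1 (4/8)
x2: 0,1,1,1,0,0,1,0 (4/8)
x3: 1,1,0,1,1,0,0,0 (4/8)
x1 + x2 + x3: 1,2,1,3,2,0,2,1
z0: 0,0,0,1,0,1,0,0 (2/8)
z1: 0,1,0,1,0,1,1,1 (5/8)
z2: 0,1,1,0,1,0,0,0 (3/8)
z3: 1,1,1,0,1,1,0,1 (6/8)
y: 0,1,0,0,1,1,0,1 (4/8)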
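The example can be replayed in software. The sketch below takes the eight-cycle streams shown on the slide, assigns them to x1 to x3 and z0 to z3 in the way that reproduces the listed sum and output sequences (that assignment is an inference from the numbers, not a label from the slide), and models the adder plus MUX as a per-cycle table lookup. The function name `resc_unit` is hypothetical.

```python
# Input streams x1, x2, x3 (each encodes t = 4/8).
x = [
    [0, 0, 0, 1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
]

# Coefficient streams z0..z3 (encoding 2/8, 5/8, 3/8, 6/8).
z = [
    [0, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 1],
]

def resc_unit(x_streams, z_streams):
    """Per cycle: count the ones among the x bits, then pass that z input."""
    counts = [sum(bits) for bits in zip(*x_streams)]
    y = [z_streams[k][cycle] for cycle, k in enumerate(counts)]
    return counts, y

counts, y = resc_unit(x, z)
output_value = sum(y) / len(y)

print(counts, y, output_value)
```

The output value 4/8 agrees with f(1/2) = (1/8)(2/8) + (3/8)(5/8) + (3/8)(3/8) + (1/8)(6/8) = 1/2, although with only eight cycles such exact agreement should not be expected in general.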


Non-Polynomial Functions

Find a Bernstein polynomial to approximate the function:

$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$

with $0 \le b_i^n \le 1$, such that

$$\int_0^1 \big(f(t) - B^n(t)\big)^2 \, dt$$

is minimized.
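The slides do not say how this constrained minimization is solved. As a rough illustration only, the sketch below swaps in a simple technique: it replaces the integral by a mean over evenly spaced sample points and runs projected gradient descent, clipping the coefficients to [0, 1] after each step. The function names, sample count, iteration count, and step size are all assumptions, and the result is not expected to reproduce the coefficients reported in the talk.

```python
from math import comb

def basis_row(n, t):
    """Values of B_0^n(t), ..., B_n^n(t)."""
    return [comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)]

def mean_sq_error(rows, targets, b):
    """Mean squared difference between the Bernstein polynomial and the targets."""
    return sum(
        (sum(r_i * b_i for r_i, b_i in zip(r, b)) - y) ** 2
        for r, y in zip(rows, targets)
    ) / len(rows)

def fit_bernstein(f, n, samples=101, iterations=500, step=0.5):
    """Discretized least squares with coefficients clipped to [0, 1]."""
    ts = [k / (samples - 1) for k in range(samples)]
    rows = [basis_row(n, t) for t in ts]
    targets = [f(t) for t in ts]
    # Starting guess: sample f at i/n, clipped into the unit interval.
    b = [min(1.0, max(0.0, f(i / n))) for i in range(n + 1)]
    start_error = mean_sq_error(rows, targets, b)
    for _ in range(iterations):
        grad = [0.0] * (n + 1)
        for r, y in zip(rows, targets):
            resid = sum(r_i * b_i for r_i, b_i in zip(r, b)) - y
            for i in range(n + 1):
                grad[i] += 2 * resid * r[i] / samples
        # Gradient step followed by projection onto [0, 1].
        b = [min(1.0, max(0.0, b_i - step * g)) for b_i, g in zip(b, grad)]
    return b, start_error, mean_sq_error(rows, targets, b)

coeffs, before, after = fit_bernstein(lambda t: t ** 0.45, n=6)

print(coeffs, before, after)
```

A dedicated bounded least-squares or quadratic-programming solver would be the more usual choice in practice; plain projected gradient descent is used here only to keep the sketch dependency-free.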

Non-Polynomial Functions

Example: Gamma correction function, $f(t) = t^{0.45}$.

Degree 6 Bernstein coefficients are:

b0 = 0.0955, b1 = 0.7207, b2 = 0.3476, b3 = 0.9988, b4 = 0.7017, b5 = 0.9695, b6 = 0.9939

Deterministic vs. Stochastic Implementation of Gamma correction function with 10% noise injection.

[Figure: output images of the conventional implementation and the stochastic implementation; panels labeled 1%, 2%, 10%]

Stochastic implementation: no pixels with errors > 20%!

Deterministic implementation: 37% of pixels with errors > 20%.

Comparison with Conventional Hardware Implementation of Image Processing Functions

[Table: Number of LUTs in FPGA mapping]

* The entire ReSC architecture, including Randomizers and De-Randomizers.
** The ReSC Unit by itself.

Comparison with Conventional Software Implementation of Image Processing Functions

[Table: Speedup (1024 cycles needed)]

* Software using math function from ‘Math.h’
** Software using direct function table lookup

Comparison of Fault Tolerance for Image Processing Functions

[Table: Percentage of Output Pixels with Errors Greater than 25%]

Noise is injected in the form of a percentage of bit flips.

The stochastic implementation never produces such errors!

Comparison of Fault Tolerance for Mathematical Functions

Sixth-order Maclaurin polynomial approx., 10 bits: sin(x), cos(x), tan(x), arcsin(x), arctan(x), sinh(x), cosh(x), tanh(x), arcsinh(x), exp(x), ln(x+1)

[Figure: relative error (axis 0 to 60) versus error ratio of input data (0, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1); series: Stochastic, Deterministic]

Conclusions

• The hardware cost is comparable.
• Stochastic computation is much more error tolerant.
• The advantage is dramatic for applications where large errors are critical but small fluctuations can be tolerated.
• (Also some pretty interesting math…)

Future Directions

• Apply the method at the processor level.
• Apply the method at the circuit level (e.g., with PCMOS).

[computational] Synthetic Biology

[Figure: Quantities of Different Types → Biological Process → Probability Distribution on outcomes]

[computational] Synthetic Biology

[Figure: Biological Process with labels X, Y, Z; Y fixed]
