Chalmers 2011

8/13/2019 Chalmers 2011

1/146

Eliminating the Hardware/Software Divide

Satnam Singh, Microsoft Research Cambridge, UK

8/13/2019 Chalmers 2011

2/146

!

IRQ, NMI

8/13/2019 Chalmers 2011

3/146

8/13/2019 Chalmers 2011

4/146

8/13/2019 Chalmers 2011

5/146

8/13/2019 Chalmers 2011

6/146

8/13/2019 Chalmers 2011

7/146

8/13/2019 Chalmers 2011

8/146

8/13/2019 Chalmers 2011

9/146

8/13/2019 Chalmers 2011

10/146

8/13/2019 Chalmers 2011

11/146

8/13/2019 Chalmers 2011

12/146

8/13/2019 Chalmers 2011

13/146

8/13/2019 Chalmers 2011

14/146

locks

monitors

condition variablesspin locks

priority inversion

8/13/2019 Chalmers 2011

15/146

8/13/2019 Chalmers 2011

16/146

8/13/2019 Chalmers 2011

17/146

8/13/2019 Chalmers 2011

18/146

8/13/2019 Chalmers 2011

19/146

8/13/2019 Chalmers 2011

20/146

multiple

independent

multi-ported

memories

fine-grain

parallelism

and

pipelining

hard and softembedded

processors

8/13/2019 Chalmers 2011

21/146

LUTs are just higher order functions

i o

lut1

oi1

i0

lut2 lut3

i0

i1

i2

o

lut4

i0

i1

i2i3

o

inv = lut1not

and2 = lut2(&&)

mux = lut3(ls d0 d1 . ifs thend1 elsed0)

8/13/2019 Chalmers 2011

22/146

XC6VLX760 758,784 logic cells, 864 DSP blocks,

1,440 dual ported 18Kb RAMs

32-bit

integer

Adder

(32/474,240)

>700MHz

332x1440

14820 sim-adds

1,037,400,000,000additions/second

8/13/2019 Chalmers 2011

23/146

8/13/2019 Chalmers 2011

24/146

XD2000i FPGA in-socketaccelerator for Intel FSB XD2000F FPGA in-socketaccelerator for AMD socket F XD1000 FPGA co-processormodule for socket 940

8/13/2019 Chalmers 2011

25/146

8/13/2019 Chalmers 2011

26/146

Case StudySpam Filtering(Alessandro Forin, MSR Redmond)

Benchmark

~50,000 regular expressions fromForefront Team (snapshot from

their Exchange server in Aug 09)

Performance Up to 6000x faster than standard Intel processors

Capable of processing at line rate of gigabit Ethernet

Power Requirement

7

10 watts rather than 200++ watts

8/13/2019 Chalmers 2011

27/146

27

Software Version FPGA Version

~6000 Messages/Sec~1 Message/Sec

8/13/2019 Chalmers 2011

28/146

8/13/2019 Chalmers 2011

29/146

8/13/2019 Chalmers 2011

30/146

Ren Mller (ETH)

FPGAs + SQL [VLDB]

8/13/2019 Chalmers 2011

31/146

CPU FPGA

8/13/2019 Chalmers 2011

32/146

8/13/2019 Chalmers 2011

33/146

8/13/2019 Chalmers 2011

34/146

Speed Grade -1 -2 -3With Layout 270MHz 320MHz 362MHzWithout Layout 210MHz 260MHz 280MHz

541 seconds

1896 seconds

8/13/2019 Chalmers 2011

35/146

opportunityscientific computingdata mining

search

image processing

financial analytics

challenge

8/13/2019 Chalmers 2011

36/146

The Accidental Semi-colon

8/13/2019 Chalmers 2011

37/146

publicstaticint[] SequentialFIRFunction(int[] weights, int[] input)

{

int[] window = newint[size];

int[] result = newint[input.Length];

// Clear to window of x values to all zero.

for(intw = 0; w < size; w++)

window[w] = 0;

// For each sample...

for(inti = 0; i < input.Length; i++){

// Shift in the new x value

for(intj = size - 1; j > 0; j--)

window[j] = window[j - 1];

window[0] = input[i];

// Compute the result value

intsum = 0;

for(intz = 0; z < size; z++)

sum += weights[z] * window[z];

result[i] = sum;

}

returnresult;

}

8/13/2019 Chalmers 2011

38/146

PLDI 1998

8/13/2019 Chalmers 2011

39/146

PLDI 2003

1

2

3

4

5

8/13/2019 Chalmers 2011

40/146

PLDI 2010

0

10

20

30

40

50

60

70

80

1 2 3 4 5 6

Series1

Series2

8/13/2019 Chalmers 2011

41/146

POPL 1998

8/13/2019 Chalmers 2011

42/146

POPL 2002

8/13/2019 Chalmers 2011

43/146

POPL 2010

8/13/2019 Chalmers 2011

44/146

8/13/2019 Chalmers 2011

45/146

ray of light

Signal

Liquid

Metal

PRET-C

Bluespec

Feldspar

Accelerator

RapidMind /Ct

Streams-C

Esterel

SHIM

8/13/2019 Chalmers 2011

46/146

universallanguage?

embeddedhigh level

software

FPGA

GPU

DSP

machine

learning

grand unification

theorypolygots

Gannet

DSLs

8/13/2019 Chalmers 2011

47/146

Our High Level Synthesis Projects

Kiwi: concurrent

C# programs for

control-oriented

applications

[David Greaves,

Univ. Cambridge]

shape analysis:

synthesis of

dynamic data

structures (C)[MPI and CMU]

Accelerator/FPGA:

synthesis of data

parallel programs in

C++/C#/F#[MSR Redmond]

HLINQ

eDSLs

[Gavin Bierman]

+ compilation of self-recursive Haskell functions to FPGA circuits!

8/13/2019 Chalmers 2011

48/146

Redmond Accelerator TeamBarry Bond

Kerry Hammil

Lubomir Litchev

8/13/2019 Chalmers 2011

49/146

Effort vs. Reward

loweffort

low

reward

higheffort

high

reward

mediumeffort

medium

reward

CUDA

OpenCL

HLSL

DirectComputeAccelerator

8/13/2019 Chalmers 2011

50/146

Accelerator

8/13/2019 Chalmers 2011

51/146

Application.EXE Accelerator.DLL

Windows on Intel/AMD

processor

DX9

GPU

(ATI, Nvidia, )

8/13/2019 Chalmers 2011

52/146

openSystemopenMicrosoft.ParallelArraysletmain(args) =

letx = newFloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])lety = newFloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])letz = x + yusedx9Target = newDX9Target()letzv = dx9Target.ToArray1D(z)printf "%A\n"zv

0

8/13/2019 Chalmers 2011

53/146

openSystemopenMicrosoft.ParallelArraysletmain(args) =

letx = newFloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])lety = newFloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])letz = x + yusesse3Target = newX64MulticoreTarget()letzv = sse3Target.ToArray1D(z)printf "%A\n"zv

0

8/13/2019 Chalmers 2011

54/146

openSystemopenMicrosoft.ParallelArrays

letmain(args) =letx = newFloatParallelArray (Array.map float32 [|1; 2; 3; 4; 5|])lety = newFloatParallelArray (Array.map float32 [|6; 7; 8; 9; 10|])letz = x + yusefpgaTarget = newFPGAMulticoreTarget()fpgaTarget.ToArray1D(z)0

8/13/2019 Chalmers 2011

55/146

8/13/2019 Chalmers 2011

56/146

8/13/2019 Chalmers 2011

57/146

[1; 2; 3; 4; 5]

FloatParallelArray

CPU Address Space

F# Array

GPU Address Space

Encapsulated

Data-parallel

array

x

[6; 7; 8; 9; 10]

FloatParallelArray

y

100010101101011010

x+yGPU code

GPU memory

GPU code

y

[7; 9; 11; 13; 15]

F# Array

8/13/2019 Chalmers 2011

58/146

usingSystem;

usingMicrosoft.ParallelArrays;

namespaceAddArraysPointwise{

classAddArraysPointwiseDX9{

staticvoidMain(string[] args)

{varx = newFloatParallelArray(new[] {1.0F, 2, 3, 4, 5});vary = newFloatParallelArray(new[] {6.0F, 7, 8, 9, 10});vardx9Target = newDX9Target();varz = x + y;foreach(variindx9Target.ToArray1D (z))

Console.Write(i+ " ");Console.WriteLine();}

}}

8/13/2019 Chalmers 2011

59/146

moduleMainwhere

importAccelerator

x = fpa [1.0, 2.0, 3.0, 4.0, 5.0]y = fpa [6.0, 7.0, 8.0, 9.0, 10.0]

z = x + y

main= dodx9Target

8/13/2019 Chalmers 2011

60/146

8/13/2019 Chalmers 2011

61/146

8/13/2019 Chalmers 2011

62/146

8/13/2019 Chalmers 2011

63/146

8/13/2019 Chalmers 2011

64/146

8/13/2019 Chalmers 2011

65/146

rX *

pa

Shift

(0,0)k[0]

+

+

*

Shift

(0,1) k[1]

+

letrecconvolve (shifts : int ->int [])

(kernel : float32 []) i(a : FloatParallelArray)= lete = kernel.[i] * ParallelArrays.Shift(a, shifts i)ifi = 0thene

else

e + convolve shifts kernel (i-1) a

8/13/2019 Chalmers 2011

66/146

8/13/2019 Chalmers 2011

67/146

8/13/2019 Chalmers 2011

68/146

8/13/2019 Chalmers 2011

69/146

8/13/2019 Chalmers 2011

70/146

publicstaticint[] SequentialFIRFunction(int[] weights, int[] input)

{

int[] window = newint[size];int[] result = newint[input.Length];

// Clear to window of x values to all zero.

for(intw = 0; w < size; w++)

window[w] = 0;

// For each sample...

for(inti = 0; i < input.Length; i++){

// Shift in the new x value

for(intj = size - 1; j > 0; j--)

window[j] = window[j - 1];

window[0] = input[i];

// Compute the result value

intsum = 0;for(intz = 0; z < size; z++)

sum += weights[z] * window[z];

result[i] = sum;

}

returnresult;

}

8/13/2019 Chalmers 2011

71/146

8/13/2019 Chalmers 2011

72/146

8/13/2019 Chalmers 2011

73/146

shift (x, 0) = [7, 2, 5, 9, 3, 8, 6, 4] = x

shift (x, -1) = [7, 7, 2, 5, 9, 3, 8, 6]

shift (x, -2) = [7, 7, 7, 2, 5, 9, 3, 8]

8/13/2019 Chalmers 2011

74/146

y = [y[0], y[1], y[2], y[3], y[4], y[5], y[6], y[7]]

= a[0] * [x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7]] +a[1]* [x[-1], x[0], x[1], x[2], x[3], x[4], x[5], x[6]] +

a[2] * [x[-2], x[-1], x[0], x[1], x[2], x[3], x[4], x[5]] +

a[3]* [x[-3], x[-2], x[-1], x[0], x[1], x[2], x[3], x[4]] +

a[4]* [x[-4], x[-3], x[-2], x[-1], x[0], x[1], x[2], x[3]]

y= a[0] * shift (x, 0) +

a[1] * shift (x, -1) +

a[2] * shift (x, -2) +

a[3] * shift (x, -3) +a[4] * shift (x, -4)

8/13/2019 Chalmers 2011

75/146

usingMicrosoft.ParallelArrays;usingA= Microsoft.ParallelArrays.ParallelArrays;namespaceAcceleratorSamples{

publicclassConvolver{

publicstaticfloat[] Convolver1D(TargetcomputeTarget,

float[] a, float[] x){

varxpar = newFloatParallelArray(x);varn = x.Length;varypar= newFloatParallelArray(0.0f, new[] { n });for(inti= 0; i< a.Length; i++)

ypar+= a[i] * A.Shift(xpar, -i);float[] result = computeTarget.ToArray1D(ypar);returnresult;

}}

}

for(inti= 0; i< a.Length; i++)ypar+= a[i] * A.Shift(xpar, -i);

8/13/2019 Chalmers 2011

76/146

8/13/2019 Chalmers 2011

77/146

usingSystem;usingSystem.Linq;usingMicrosoft.ParallelArrays;namespaceAcceleratorSamples{

staticclassConvolver2D{

staticFloatParallelArrayconvolve(thisFloatParallelArraya,Func shifts, float[] kernel)

{returnkernel

.Select((k, i) => k * ParallelArrays.Shift(a, shifts(i)))

.Aggregate((a1, a2) => a1 + a2);}staticFloatParallelArrayconvolveXY(thisFloatParallelArrayinput, float[] kernel){

returninput

.convolve(i => new[] { -i, 0}, kernel)

.convolve(i => new[] { 0, -i }, kernel);}staticvoidMain(string[] args){

constintinputSize = 10;varrandom = newRandom(42);varinputData = newfloat[inputSize, inputSize];for(introw= 0; row< inputSize; row++)

for(intcol= 0; col< inputSize; col++)inputData[row, col] = (float)random.NextDouble() * random.Next(1, 100);

vartestKernel = new[] { 2F, 5, 7, 4, 3};vardx9Target = newDX9Target();varinputArray = newFloatParallelArray(inputData);varresult = dx9Target.ToArray2D(inputArray.convolveXY(testKernel));for(varrow= 0; row< inputSize; row++){

for(intcol= 0; col< inputSize; col++)Console.Write("{0} ", result[row, col]);

Console.WriteLine();}

}}

}

staticFloatParallelArrayconvolve(thisFloatParallelArraya,Func shifts,float[] kernel)

{returnkernel.Select((k, i) => k * ParallelArrays.Shift(a, shifts(i)))

.Aggregate((a1, a2) => a1 + a2);}

staticFloatParallelArrayconvolveXY(thisFloatParallelArrayinput,float[] kernel)

{ returninput

.convolve(i => new[] { -i, 0}, kernel)

.convolve(i => new[] { 0, -i }, kernel);}

8/13/2019 Chalmers 2011

78/146

openSystemopenMicrosoft.ParallelArrays[]letmain(args) =

// Declare a filter kernel for the convolution lettestKernel = Array.map float32 [| 2; 5; 7; 4; 3|]// Specify the size of each dimension of the input arrayletinputSize = 10// Create a pseudo-random number generatorletrandom = Random (42)

// Declare a psueduo-input data arraylettestData = Array2D.init inputSize inputSize (funi j ->float32 (random.NextDouble() *

float (random.Next(1, 100))))

// Create an Accelerator float parallel array for the F# input array usetestArray = new FloatParallelArray(testData)// Declare a function to convolve in the X or Y direction letrecconvolve (shifts : int ->int []) (kernel : float32 []) i (a : FloatParallelArray)

= lete = kernel.[i] * ParallelArrays.Shift(a, shifts i)ifi = 0then

eelse

e + convolve shifts kernel (i-1) a// Declare a 2D convolverletconvolveXY kernel input

= // First convolve in the X direction and then in the Y directionletconvolveX = convolve (funi ->[| -i; 0|]) kernel (kernel.Length - 1) inputletconvolveY = convolve (funi ->[| 0; -i |]) kernel (kernel.Length - 1) convolveXconvolveY

// Create a DX9 target and use it to convolve the test input usedx9Target = newDX9Target()letconvolveDX9 = dx9Target.ToArray2D (convolveXY testKernel testArray)printfn "DX9: -> \r\n%A"convolveDX90

letconvolveXY kernel input= // First convolve in the X direction and then in Y

letconvolveX = convolve (funi ->[| -i; 0|]) kernel(kernel.Length - 1) inputletconvolveY = convolve (funi ->[| 0; -i |]) kernel

(kernel.Length - 1) convolveXconvolveY

8/13/2019 Chalmers 2011

79/146

0

5

10

15

20

25

0 5 10 15 20 25 30 35 40 45

speedupo

veronec

ore

kernel size

x64 multicore target benchmark for 2D convolver

(24 core server Xeon E7540)

6 core speedup

12 core speedup

18 core speedup

24 core speedup

8/13/2019 Chalmers 2011

80/146

8/13/2019 Chalmers 2011

81/146

Convolver

8/13/2019 Chalmers 2011

82/146

8/13/2019 Chalmers 2011

83/146

8.249ns max delay

3 x DSP48Es

63 slice registers

24 slice LUTs

8/13/2019 Chalmers 2011

84/146

8/13/2019 Chalmers 2011

85/146

8/13/2019 Chalmers 2011

86/146

8/13/2019 Chalmers 2011

87/146

8/13/2019 Chalmers 2011

88/146

8/13/2019 Chalmers 2011

89/146

DRAM

8/13/2019 Chalmers 2011

90/146

technology node 130nm CMOS

(2006)

45nm CMOS

(2008)

transfer 32b

across-chip

20 computations 57 computations

transfer 32boff-chip 260 computations 1300 computations

Power of Computation vs. Communication

numbers derived from work by W. Dally, Stanford

8/13/2019 Chalmers 2011

91/146

8/13/2019 Chalmers 2011

92/146

8/13/2019 Chalmers 2011

93/146

Virtex-6 XC6VLX240T-1

317MHz

Area 6%

108 DSP48E1 (out of a768)

2.7W (at 25C)

1,110mega-samples per second

cf. CUDA version on 470GTX at

552 mega-samples per second

(single precision)

8/13/2019 Chalmers 2011

94/146

Kiwi Thesis

thread

2

thread

3

thread

1

8/13/2019 Chalmers 2011

95/146

8/13/2019 Chalmers 2011

96/146
http://upload.wikimedia.org/wikipedia/fr/b/b4/Mono_project_logo.svg

8/13/2019 Chalmers 2011

97/146
http://www.amazon.com/gp/product/images/0470052627/ref=dp_image_0?ie=UTF8&n=283155&s=books

8/13/2019 Chalmers 2011

98/146

8/13/2019 Chalmers 2011

99/146

8/13/2019 Chalmers 2011

100/146
http://msdn.microsoft.com/en-us/devlabs/dd491992.aspx

8/13/2019 Chalmers 2011

101/146

Kiwi

structural imperative (C)parallel

imperative

gate-level

VHDL/Verilog KiwiC-to-

gates

&0

0

0

Q

QSET

CLR

S

R

;

;

;

jpeg.cthread

2

thread

3

thread

1

8/13/2019 Chalmers 2011

102/146

8/13/2019 Chalmers 2011

103/146

8/13/2019 Chalmers 2011

104/146

8/13/2019 Chalmers 2011

105/146

Kiwi

Library

Kiwi.cs

circuit

model

JPEG.cs

Visual Studio

multi-thread simulation

debuggingverification

Kiwi Synthesis

circuit

implementation

JPEG.v

8/13/2019 Chalmers 2011

106/146

parallel

program

C#

Thread 1

Thread 2

Thread 3

Thread 3

C to

gates

C to

gates

C to

gates

C to

gates

circuit

circuit

circuit

circuit

Verilog

for system

8/13/2019 Chalmers 2011

107/146

Ports and Clockspublicst ticclass I2C

{ [OutputBitPort("scl")]

st ticbool scl;[InputBitPort("sda_in")]

st ticbool sda_in;[OutputBitPort("sda_out")]

st ticbool sda_out;[OutputBitPort("rw")]

st ticbool rw;

circuit portsidentified by

custom attribute

8/13/2019 Chalmers 2011

108/146

publicst ticint max2(int a, int b){ int result;if(a > b)

result = a;elseresult = b;

returnresult;}

.method public hidebysig staticint32max2(int32 a,

int32 b) cil managed{// Code size 12 (0xc).maxstack 2

.locals init ([0] int32 result)IL_0000: ldarg.0IL_0001: ldarg.1IL_0002: ble.s IL_0008

IL_0004: ldarg.0

IL_0005: stloc.0IL_0006: br.s IL_000a

IL_0008: ldarg.1IL_0009: stloc.0IL_000a: ldloc.0IL_000b: ret

}

max2(3, 7)

stack

local memory

0

3

7

7

7

8/13/2019 Chalmers 2011

109/146

Writing to a Channel

publicclassChannel

{

T datum;

boolempty = true;

publicvoidWrite(T v)

{

lock(this)

{

while(!empty)

Monitor.Wait(this);

datum = v;empty = false;

Monitor.PulseAll(this);

}

}

8/13/2019 Chalmers 2011

110/146

Reading from a Channel

publicT Read()

{

T r;

lock(this)

{

while(empty)

Monitor.Wait(this);

empty = true;

r = datum;

Monitor.PulseAll(this);}

returnr;

}

8/13/2019 Chalmers 2011

111/146

8/13/2019 Chalmers 2011

112/146

systems level concurrency constructs

threads, events, monitors, condition variables

rendezvous join patternstransactional

memory

data

parallelism

userapplications

domain specificlanguages

8/13/2019 Chalmers 2011

113/146

8/13/2019 Chalmers 2011

114/146

Filter Example

thread one-place

channel

8/13/2019 Chalmers 2011

115/146

Transposed Filter

8/13/2019 Chalmers 2011

116/146

8/13/2019 Chalmers 2011

117/146

Inter-thread Communication and

Synchronization

// Create the channels to link together the taps

for(intc = 0; c < size; c++){

Xchannels[c] = newKiwi.Channel();

Ychannels[c] = newKiwi.Channel();

Ychannels[c].Write(0); // Pre-populate y-channel registers with zeros

}

8/13/2019 Chalmers 2011

118/146

// Connect up the taps for a transposed filter

for(inti = 0; i < size; i++)

{

intj = i; // Quiz: why do we need the local j?

ThreadtapThread = newThread(delegate() { Tap(j, weights[j],

Xchannels[j],

Ychannels[j],

Ychannels[j+1]); });

tapThread.Start();}

8/13/2019 Chalmers 2011

119/146

8/13/2019 Chalmers 2011

120/146

8/13/2019 Chalmers 2011

121/146

8/13/2019 Chalmers 2011

122/146

t ti bli id h ()

8/13/2019 Chalmers 2011

123/146

staticpublicvoidecho(){

tx_sof_n = !false; // We are not at the start of a frametx_src_rdy_n = !false;

tx_eof_n = !false; // We are not at the end of a frameboolstart = !rx_sof_n && !rx_src_rdy_n; // The start conditioninti, j;booldoneReading;

while(true) // Process packets indefinately{

// Wait for SOF and SRC_RDYwhile(!start)

{Kiwi.Pause(); // Wait for a clock tickstart = !rx_sof_n && !rx_src_rdy_n; // Check for start of frame

}// Read in the entire framei = 0;doneReading = false;

// Read the remaining bytes

while(!doneReading){

if(!rx_src_rdy_n){

buffer[i] = rx_data;i++;

}doneReading = !rx_eof_n;Kiwi.Pause();

}

8/13/2019 Chalmers 2011

124/146

8/13/2019 Chalmers 2011

125/146

C#

softprocessor

8/13/2019 Chalmers 2011

126/146

8/13/2019 Chalmers 2011

127/146

fib :: Int -> Intfib 0 = 0fib 1 = 1

fib n= n1 + n2wheren1 = fib (n

1)n2 = fib (n - 2)

8/13/2019 Chalmers 2011

128/146

STATE 1FREEPRECASE

ds1 := dsCASEds1WHEN1 =>RETURN 1

WHEN0 =>RETURN0

WHENothers =>v0 := ds1 - 2RECURSE[v0] 2 [ds1]

ENDCASESTATE 3 FREE n2n1 := resultInt

v2 := n1 + n2RETURNv2

STATE 2 FREE ds1n2 := resultIntv1 := ds1 - 1RECURSE[v1] 3 [n2]

8/13/2019 Chalmers 2011

129/146

8/13/2019 Chalmers 2011

130/146

8/13/2019 Chalmers 2011

131/146

8/13/2019 Chalmers 2011

132/146

8/13/2019 Chalmers 2011

133/146

8/13/2019 Chalmers 2011

134/146

8/13/2019 Chalmers 2011

135/146

8/13/2019 Chalmers 2011

136/146

relocation viavirtualization???

8/13/2019 Chalmers 2011

137/146

+ encryption + virtualization +

data-processing

no standard ABIno FPGA-kernel-userspace model

The cloud is just an extension ofexisting OS paradigms FPGAs getleft behind they lack abstractionboundaries

Split Trust

8/13/2019 Chalmers 2011

138/146

Split Trust

managingphysical devicevs.

usinga physical devicemanagement

domain?

8/13/2019 Chalmers 2011

139/146

FPGAs Improve Cloud Security

8/13/2019 Chalmers 2011

140/146

8/13/2019 Chalmers 2011

141/146

8/13/2019 Chalmers 2011

142/146

8/13/2019 Chalmers 2011

143/146

8/13/2019 Chalmers 2011

144/146

8/13/2019 Chalmers 2011

145/146

8/13/2019 Chalmers 2011

146/146

Documents

Chalmers 2011